ABSTRACT



This thesis presents several interactive computer programs for the analysis of 
multivariate data. A special case is that of panel data: multiple time series of short
length. The first program, BOXPLOTAB, handles this type of multivariate data; it is
an enhancement of an existing graphical technique for exploratory data analysis
known as BOXPLOTS. The program works by inserting boxplots as column dividers
in a table of the raw data from which the boxplots are derived. This combination of the raw
data and the graphical representation of that data improves the understanding of the
characteristics of the data in exploratory and descriptive applications; differencing and
tracing of data through the table are also implemented. This thesis also presents and
explores the use of other graphical techniques for exploratory data analysis of
multivariate data such as STAR plots, PROFILE plots, CODED SCATTER plots and
CODED DRAFTSMAN plots. These techniques are examined and implemented in a
series of computer programs which produce these graphical displays. A technical
description of each computer program is presented and user implementation procedures 
are discussed. The programs are implemented in APL and run in conjunction with the 
experimental IBM APL Graphics program GRAFSTAT. To demonstrate the use of 
these techniques, an analysis is conducted on several sets of multivariate data. 



THESIS DISCLAIMER 



The reader is cautioned that computer programs developed in this research may 
not have been exercised for all cases of interest. While every effort has been made, 
within the time available, to ensure that the programs are free of computational and 
logic errors, they cannot be considered validated. Any application of these programs 
without additional verification is at the risk of the user. 
I. INTRODUCTION



A. PREFACE 

One of the main problems in experimental statistics and experimental design is 
the exploratory analysis of raw data. This problem is compounded when the data
presented to the statistician come from unknown multiple populations, the so-called
multivariate data. A special case is that of panel data: multiple time series of short
length. The initial purpose of the data analysis is to try to capture the most important
distributional characteristics of the data such as the range, location and spread of the 
data points. For the experimental statistician the main tool available for the analysis of 
the data is the graphical display of the marginal distributions of the data in order to 
visualize and gain better understanding of these characteristics and to compare them 
against those of the different populations. Following this, interactions or dependencies 
can be examined, and this is the domain of multivariate data analysis. 

B. PURPOSE 

The purpose of this thesis is twofold: first, to add to the tabular display of the 
original multivariate data an existing graphical technique known as the BOXPLOT (see 
[Ref. 1]). This addition can be done in several alternative ways and is done in order to
better understand the populations and the relations between the different populations.
The second purpose of the thesis is to make available different computer programs to 
exploit several other enhanced statistical graphical techniques for multivariate data 
analysis. These techniques are: STAR plots, PROFILE plots, CODED SCATTER 
plots and CODED DRAFTSMAN plots. 

C. BACKGROUND 

Presently, the BOXPLOT technique is one of the most common graphical
techniques used by data analysts, both outside and at the Naval Postgraduate School
(NPS). There are different software packages that provide these plots, some of which
are in the experimental IBM APL GRAFSTAT program and some in the IBM
Mainframe NONIMSL library.

One of the most important limitations of this graphical display technique is the
absence of the raw data in the display; this absence is critical in the special case of
multiple box plots for the comparison of multiple sets of data. Having the raw data in
the display would provide an immediate identification of peculiar characteristics of the
data, such as pinpointing the outliers and the variability and/or relation of a sample
datum with respect to other samples. As it is, once the BOXPLOTS are displayed on the
screen (or printed on a graph), the analyst has to go back to the original raw data in
order to identify these data points. This limitation is overcome by the new technique
presented herein, which is called BOXPLOTTED TABLES. This technique has already been
implemented and can be used on the NPS IBM 3033 computer through an APL (A
Programming Language) program, which makes use of the graphical capabilities of the
IBM experimental GRAFSTAT software package. GRAFSTAT itself provides an interactive
technique for identifying odd or outlying points, which highlights the importance of
data point identification. However, one does not always have access to such a program,
and the ability to do this identification on a printed page is important to a data
analyst. The BOXPLOTTED TABLES do precisely this.
Note too that a primary concern in multivariate data analysis is to get as much 
information on a two dimensional page as possible. Thus having tabular and 
distributional data together on one page, as in the BOXPLOTTED TABLES, is a step 
in this direction. 

There are other graphical techniques commonly used by data analysts, such as
SCATTER plots, STAR plots, PROFILE plots, CODED SCATTER plots and
CODED DRAFTSMAN plots (see [Ref. 1]). These techniques are mainly used to
enhance the interpretation and understanding of displayed multivariate data. Out of
these, the DRAFTSMAN and SCATTER plot techniques (without coded symbols) are
the only ones available at NPS up to this point. These other techniques are used to
display the data points in many different forms, giving a new perspective to the
interpretation of the original data.

This thesis presents a group of APL functions that make these graphical
techniques available to the experimental statistician at the NPS. Various
examples that show how to use this software to analyze and graphically display sample 
data will be shown in the following chapters of this thesis. 

D. ORGANIZATION 

This thesis consists of three main blocks. The first one, Chapter Two, is dedicated
to explaining the technical aspects of these graphical techniques; the mathematical and
statistical attributes of each technique are treated in this chapter. The second block is
intended to introduce the user to the APL software code used to implement these
techniques. In Chapter Three, both the user and system requirements are explained;
for the more technically oriented reader, the APL code is listed in
Appendix A. In addition, several examples of program execution are listed in
Appendix B. The last block, composed of Chapter Four and Appendix C, is dedicated
to the exploratory analysis of several sets of sample data to demonstrate some of the
potential applications of these graphical approaches to statistical analysis.
II. GRAPHICAL TECHNIQUES



A. BOXPLOTTED TABLES 

1. Overview 

The BOXPLOT graphical technique was first conceived by Tukey as a method
to display an almost one-dimensional summary of the distribution characteristics of a
set of data; Chambers [Ref. 1] provides an excellent analysis of this technique. This
display shows some of the most prominent characteristics of the sample distribution
such as the median, mean, the inter-quartile range and the outliers, if there are any. In
the case where the sample comes from multivariate data, the BOXPLOT is used not 
only to show the individual characteristics of each subsample, but, in addition, to 
compare the behavior of these characteristics with respect to other samples (see 
[Ref. 1: p. 89] ). Figure 2.1 shows a BOXPLOT display. The BOXPLOT's almost 
one-dimensional character, as opposed to the two-dimensional character of the familiar 
histogram, facilitates comparison of the marginal properties of multivariate data sets. 

One of the limitations of the BOXPLOT is the difficulty of identifying specific
values of interest such as outliers; if the identification of the outliers is the prominent
concern, the statistician must refer to the original data in order to identify
which data point each outlier corresponds to.

A solution to this limitation, suggested by Professor P.A.W. Lewis in an 
unpublished work, is to show the original data and the BOXPLOT in the same 
graphical tabular display. In this case, the BOXPLOTS are shown as dividers of the
original tabulated data, so that aberrations are readily apparent and checkable (see
Figure 2.2). This technique clearly requires the availability of high resolution graphics
and a sophisticated plotting and data manipulation package. This requirement is met 
by the experimental APL GRAFSTAT program from IBM Research which is being 
used at the NPS on a test bed basis. 

2. Technical details of BOXPLOTTED tables 

In the BOXPLOT the top and bottom of the rectangle represent the upper
and lower quartiles of the data, respectively. Therefore, the length of the rectangle
represents the inter-quartile range, $\mathrm{IQR} = Q(0.75) - Q(0.25)$, where $Q(\alpha)$, for
$0 < \alpha < 1$, is the $\alpha$-quantile of the sample. The mean of the data sample is shown by a small
Figure 2.1 BOXPLOT of California Hospital Data (Per Capita, 
Hospital Expenses, Years 1971-1975, in 14 Health Service Areas). 




Figure 2.2 BOXPLOTTED Table of California Hospital Data (Per Capita 
Hospital Expenses, Years 1971-1975, in 14 Health Service Areas). 
circle, and the median by an asterisk inside the rectangle. The solid lines going out
from the top and bottom of the rectangle represent the adjacent values. These values
are defined as those data points greater than or equal to $Q(0.75)$ and less than or equal to
$Q(0.75) + \mathrm{IQR}$ for the upper line, and those values less than or equal to $Q(0.25)$ and
greater than or equal to $Q(0.25) - \mathrm{IQR}$ for the bottom line. Those data points that fall in
the range $[Q(0.25) - 1.5\,\mathrm{IQR},\ Q(0.25) - \mathrm{IQR}]$ or $[Q(0.75) + \mathrm{IQR},\ Q(0.75) + 1.5\,\mathrm{IQR}]$ are
called outliers and are represented by small light circles. The data points that fall
beyond the ranges of these outliers are called extreme outliers and are represented by
small black circles. As an example, for normally distributed data, approximately 5
percent of the points should be outliers and marked with light circles and only about
0.5 percent should be extreme outliers (see [Ref. 2]).
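
The classification described above can be illustrated with a short Python sketch. It is not the APL code of this thesis; the function name `classify_points` and the use of the "inclusive" interpolation rule for the quartiles are assumptions made only for illustration, since the text does not state how $Q(\alpha)$ is computed.

```python
from statistics import quantiles


def classify_points(data):
    """Label each value as 'box', 'adjacent', 'outlier' or 'extreme'.

    Follows the ranges given in the text: adjacent values lie within one
    IQR of the box, outliers between 1 and 1.5 IQR, extreme outliers beyond.
    """
    # Assumption: quartiles are computed with the 'inclusive' interpolation rule.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for x in data:
        # Distance of x outside the box; zero or negative means inside the box.
        dist = max(q1 - x, x - q3)
        if dist <= 0:
            labels.append("box")
        elif dist <= iqr:
            labels.append("adjacent")
        elif dist <= 1.5 * iqr:
            labels.append("outlier")
        else:
            labels.append("extreme")
    return labels
```

The labels are returned in the order of the input, so each marked point can be matched directly to its row in the table.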

To obtain a BOXPLOTTED table, a tabular display of the data is added to 
the BOXPLOT display. At the bottom of each column the estimates for the mean, 
median, variance and the rank correlation between that column and the next column to 
the right are listed. The estimators for these parameters are defined as follows : 

Let $X_{ij}$ be the entry in the $i$th row and $j$th column, and let $n$ be the number of
values in each column. Then

$$\mathrm{Mean}_j = \bar{X}_j = \sum_{i=1}^{n} X_{ij} / n.$$  (2.1)

The Median is defined as follows. Let $\mathrm{MID}_j$ be $n/2$ if $n$ is even, and the smallest
integer larger than $n/2$ if $n$ is odd. Then

$$\mathrm{Median}_j = X^*_j(\mathrm{MID}_j),$$  (2.2)

if $n$ is odd, or

$$\mathrm{Median}_j = [X^*_j(\mathrm{MID}_j) + X^*_j(\mathrm{MID}_j + 1)] / 2,$$  (2.3)

if $n$ is even, where $X^*_j$ represents the $j$th column sorted in descending order. The
estimator for the variance is

$$\mathrm{Variance}_j = \sum_{i=1}^{n} (X_{ij} - \bar{X}_j)^2 / (n - 1).$$  (2.4)
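
As a rough illustration of the column summaries printed at the bottom of the table, the following Python sketch computes the mean, median and variance of one column. The function name `column_summary` is hypothetical and the code is not taken from the APL listing; it assumes a column with at least two values.

```python
def column_summary(column):
    """Return (mean, median, variance) for one column of the table."""
    n = len(column)
    mean = sum(column) / n
    ordered = sorted(column, reverse=True)  # descending order, as in the text
    if n % 2 == 1:
        median = ordered[n // 2]  # middle value (0-based index)
    else:
        # Average of the two central values.
        median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
    # Sample variance with divisor n - 1.
    variance = sum((x - mean) ** 2 for x in column) / (n - 1)
    return mean, median, variance
```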
Finally, Spearman's $\rho$ (rho) rank correlation coefficient is defined as
follows: let $X_i$ and $Y_i$ be two sets of data, and let $R(X_i)$ and $R(Y_i)$ be the ranks of
$X_i$ and $Y_i$ as compared to the other $X$ and $Y$ values respectively, for $i = 1, 2, \ldots, n$.
$R(X_i) = 1$ if $X_i$ is the smallest of $X_1, X_2, \ldots, X_n$, $R(X_i) = 2$ if $X_i$ is the second
smallest, and so on, with rank $n$ being assigned to the largest of the $X_i$. The same
applies for $R(Y_i)$. When assigning the ranks, if a tie is found (two or more
sample values that are exactly equal to each other are tied), assign to each tied value
the average of the ranks that would have been assigned if there had been no ties (see
[Ref. 3: p. 252]). Then the estimator is

$$\rho_j = \frac{\sum_{i=1}^{n} [R(X_i) - (n+1)/2]\,[R(Y_i) - (n+1)/2]}{n(n^2 - 1)/12},$$  (2.5)

if there are no ties in the data. If there are ties in the data, then the estimator is

$$\rho_j = \frac{\sum_{i=1}^{n} R(X_i) R(Y_i) - n((n+1)/2)^2}{\left[\sum_{i=1}^{n} R(X_i)^2 - n((n+1)/2)^2\right]^{1/2} \left[\sum_{i=1}^{n} R(Y_i)^2 - n((n+1)/2)^2\right]^{1/2}}.$$  (2.6)
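
The rank correlation can be sketched in Python as follows. This is an illustrative reading of the definitions above, not the thesis code; the names `average_ranks` and `spearman_rho` are hypothetical. The sketch always uses the tie-aware form of the estimator, which gives the same value as the simpler form when there are no ties.

```python
from math import sqrt


def average_ranks(values):
    """Rank 1 = smallest; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation between two equally long columns."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    c = n * ((n + 1) / 2) ** 2
    num = sum(a * b for a, b in zip(rx, ry)) - c
    # Note: a constant column makes the denominator zero (correlation undefined).
    den = sqrt(sum(a * a for a in rx) - c) * sqrt(sum(b * b for b in ry) - c)
    return num / den
```

For the table display, `x` would be one column and `y` the next column to its right.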

Once the BOXPLOTTED tables for the original data are obtained, it is then
possible to join with lines the values that have the same rank (order in magnitude)
within their corresponding columns. The statistician may choose to use this technique
when it is desirable to study any possible relation with respect to time among variables
(as in the case of multiple short time series), or with respect to magnitudes (as in the
case of data with mixed qualitative and quantitative information).

If the data are ordered (in descending order on the first column), then the 
outlier in the first boxplot corresponds to the first value in the table. However, this 
ordering may be lost in the second column, so that it is not clear that an outlier in the 
second boxplot corresponds to the first value in the second column and so forth. Thus 
if a line is drawn linking the largest value in each column, two extreme results are 
informative. If the line is straight (or almost straight), it means that the outliers in 
successive columns come from the same source. If the line wanders, then there is no 
structural relationship along columns (or time, if one considers panel data). As an 
example, the study of health care expenses is a good prototype of multiple short time 
series analysis. 
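
The idea of following a given rank through the columns can be sketched in Python. This is only an illustration under assumed names (`rank_path`, `is_straight`) and an assumed data layout (a list of rows); it is not the procedure used by the APL program.

```python
def rank_path(table, rank=1):
    """For each column, return the row index holding the rank-th largest value.

    `table` is a list of rows; rank=1 follows the largest value in each column.
    """
    n_cols = len(table[0])
    path = []
    for j in range(n_cols):
        column = [row[j] for row in table]
        # Row indices ordered by descending column value.
        order = sorted(range(len(column)), key=lambda i: column[i], reverse=True)
        path.append(order[rank - 1])
    return path


def is_straight(path):
    """True when the same row holds the given rank in every column."""
    return len(set(path)) == 1
```

A straight path suggests that the extreme values in successive columns come from the same source; a path that wanders suggests no such structure.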
Figure 2.3 BOXPLOTTED Table with Joining Lines 
(Per Capita, California Hospital Expenses, Years 1971-1975). 




Figure 2.4 BOXPLOTTED Table of Relative Differences, First col. 
(Per Capita, California Hospital Expenses, Years 1971-1975). 
In Figure 2.2 one could draw a line joining the highest or lowest expenses 
through time and see to which hospitals they belong. The reader could now make 
reference to Figure 2.3. In this figure one could see the transition of health care cost 
for the most and least expensive health service areas in California through the period 
of 1971 to 1975. A plot with the connections is shown in Figure 4.2. 

In addition to this option, the user can display the differences between the 
values of the columns. These differences could be relative to the first column (base 
column) or with respect to the previous one. When the statistician is dealing with panel 
data, it is desirable to study the trend of relative (or absolute) rate of change in the 
data points. Again, in the analysis of health care expenses, one may want to study the 
relative (or absolute) rate of change in this cost through a given period. In Figure 2.4, 
it is possible to infer that the relative change of health care expenses is not linear 
within the period of study. This inference would be enhanced by a plot of differences, 
as is done in Chapter Four. It is also possible to readily identify those health service
areas that had the extreme rates of change. An analysis of this data is presented in
Chapter Four. 
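
The differencing options can be sketched in Python as follows. The function name `column_differences` is hypothetical, and the definition of a relative difference as the difference divided by the reference value is an assumption, since the text does not give the formula.

```python
def column_differences(table, base="first", relative=False):
    """Difference each column against the first or the previous column.

    `table` is a list of rows. With base="first" every column is compared
    with column 1; with base="previous" it is compared with its left neighbor.
    """
    result = []
    for row in table:
        diffs = []
        for j in range(1, len(row)):
            ref = row[0] if base == "first" else row[j - 1]
            d = row[j] - ref
            # Assumption: relative difference = difference / reference value.
            diffs.append(d / ref if relative else d)
        result.append(diffs)
    return result
```

The result has one column fewer than the input and can be displayed in the same tabular form as the original data.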

B. STAR PLOTS 
1. Overview 

In working with multivariate data, one of the key problems is how to 
represent more than two variables (dimensions) in a single display. There are several 
graphical approaches to deal with this problem, as mentioned in Chambers [Ref. 1]. 
Four of these techniques are treated in this thesis : STAR plots, PROFILE plots, 
CODED SCATTER plots and CODED DRAFTSMAN plots. 

In the STAR plot each subpopulation is displayed by a star in which each arc 
(or ray) represents a variable of interest. The value of each variable is coded by the 
length of the corresponding arc; to avoid overlapping between arcs, these are portrayed 
symmetrically about the origin. This can be seen in Figure 2.5 and Figure 2.6, in which 
some characteristics of automobile data are displayed (a complete description of the 
data is presented in Chapter Four). In Figure 2.5, twelve variables of interest are 
assigned to the rays of the star (i.e., price, length, etc.). In Figure 2.6, the same 
representation is used to portray the same information but for several automobile 
subpopulations (models). It is now easy to graphically compare these characteristics 
(by the corresponding length of each ray) among the different subpopulations (models). 
Figure 2.5 Assignment of Variables 
to the Rays of the STAR (Automobile Data). 




Figure 2.6 STAR Plot of the Automobile Data. 
2. Technical details of STAR plots 

There are two essential features in the construction of the STAR plot: the
lengths of the rays and the angle between the rays. As stated in the last section, the
values of the variables are represented by the lengths of the rays; therefore, these values
should be non-negative and be represented using a similar scale. This is accomplished
by initially rescaling the values of the variables using the following formula:

$$X^*_{ij} = (1 - c)(X_{ij} - \min_j) / (\mathrm{Max}_j - \min_j) + c,$$  (2.7)

where $c$ is a constant, usually given a value of zero, and $X_{ij}$ represents the $i$th
observation of the $j$th variable. The coefficients $\min_j$ and $\mathrm{Max}_j$ represent the minimum
and maximum values of the $j$th variable, respectively. Once the rescaling of the variables
is performed, the angle between the rays must be determined. The first variable
(variables are enumerated in increasing order) is plotted on the horizontal axis at an
angle of zero degrees. Then the angle of the $j$th variable is calculated
using the following formula:

$$\omega_j = 2\pi (j - 1) / n,$$  (2.8)

where $n$ represents the number of variables (parameters) and $j$ is the index of the
variable. The rays are then enumerated from 2 to $n$ and displayed counterclockwise.
Finally, the star is constructed by joining the end points of the $n$ rays. The end point
of each ray is calculated by the following formula:

$$P_{ij} = (X^*_{ij} R \cos\omega_j,\ X^*_{ij} R \sin\omega_j),$$  (2.9)

where $j = 1, 2, \ldots, n$ indexes the variables, and $R$ is the maximum allowable radius of the star.
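
The construction of one star can be sketched in Python. The names `rescale` and `star_vertices` are hypothetical, and the sketch assumes that each variable takes at least two distinct values so that the rescaling denominator is not zero.

```python
from math import cos, pi, sin


def rescale(column, c=0.0):
    """Map the values of one variable onto [c, 1] using its minimum and maximum."""
    lo, hi = min(column), max(column)
    return [(1 - c) * (x - lo) / (hi - lo) + c for x in column]


def star_vertices(scaled_row, radius=1.0):
    """End points of the rays for one observation (one star).

    `scaled_row` holds the rescaled values of the n variables; ray j is
    drawn at angle 2*pi*(j-1)/n, counterclockwise from the horizontal axis.
    """
    n = len(scaled_row)
    points = []
    for j, value in enumerate(scaled_row):  # j = 0 corresponds to variable 1
        angle = 2 * pi * j / n
        points.append((value * radius * cos(angle), value * radius * sin(angle)))
    return points
```

Joining the returned points in order, and closing the polygon back to the first point, gives the outline of the star.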

C. PROFILE PLOTS 
1. Overview 

The PROFILE technique is similar in nature to the STAR plot; the only
difference is that in the PROFILE plot the rays are displayed as equidistant vertical
lines arising from a common horizontal axis. In fact, as stated in Chambers [Ref. 1: p.
159], the STAR plots are actually PROFILE plots conceived in polar coordinates. In
the PROFILE plots, the values of the variables determine the heights of the
end points of the connected line segments (see Figure 2.7 and Figure 2.8).

One of the possible advantages of the PROFILE plots over the STAR plots is 
that in the former it is possible to represent variables with negative values. Since in the 
PROFILE plots all value-vectors are displayed with respect to a horizontal axis, it is 
then easy to show variables with negative values. The base line, the horizontal axis, is 
used to represent zero and negative values are displayed by lines dipping below this 
line. In the STAR plots this is not possible. 

2. Technical details of PROFILE plots 

In the PROFILE plot the rescaling is performed using the same formula as for
the STAR plot. Negative values of the variables are accommodated by using the following
rescaling formula:

$$X^*_{ij} = X_{ij} / \mathrm{Max}_j.$$  (2.10)
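
A minimal Python sketch of this rescaling follows. The names `profile_rescale` and `profile_points` are hypothetical, and the sketch assumes that the maximum of each variable is positive, which the text does not state.

```python
def profile_rescale(column):
    """Divide each value of a variable by the maximum of that variable.

    Negative values stay negative, so they plot below the base line.
    Assumption: the maximum of the column is positive.
    """
    biggest = max(column)
    return [x / biggest for x in column]


def profile_points(scaled_row, spacing=1.0):
    """(x, height) pairs for one observation, on equidistant vertical lines."""
    return [(j * spacing, height) for j, height in enumerate(scaled_row)]
```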

D. CODED SCATTER PLOTS 

1. Overview 

The CODED SCATTER plot is an enhancement of the most commonly used
technique, namely the SCATTER plot for two variables. Using this coding technique it is
possible to represent more than two variables (dimensions) in the same display.
Different symbols, and sizes and colors of these symbols, are used to represent three or 
higher dimensional data. The size and color of the symbols could be used to control 
different ranges of data values. 

2. Technical details of CODED SCATTER plot 

The CODED SCATTER plot uses essentially the same plotting technique as 
the usual SCATTER plot; only coded symbols, sizes and colors are added. This is in 
line with the need to represent as many dimensions as possible from a multivariate data 
set on a two dimensional graph. Thus, in a CODED SCATTER plot the position of
each point in the graphical plane is given by the (X,Y) values of the two
variables. Then, if X is the miles per gallon variable in a data set and Y is the price of
the car, plotting Y vs X shows how gas consumption increases or decreases with
increasing cost of a car, or that there is a much more complex relationship between the
two variables, or that there is even no relationship at all. However, there are other
Figure 2.7 Assignment of Variables to the Lines of the PROFILE 
Plot (Automobile Data). 




Figure 2.8 PROFILE Plot of the Automobile Data. 
variables or factors involved in the relationship. These may be either continuous, 
discrete or categorical factors. An example of the first is the weight of the car, an 
example of the second is the number of cylinders in the car. A categorical factor is 
origin, i.e. whether the car is domestically produced or not. 



USA = A, FOREIGN = F AND WEIGHT = SIZE OF LETTER



Figure 2.9 CODED SCATTER Plot of the Automobile Data
(Price vs M.P.G. City).

Figure 2.9 shows a CODED SCATTER PLOT of the car price variable, X,
versus the miles per gallon variable, Y. The best way to code the origin of the car is by
using colors; however, due to reproduction limitations, it has been encoded as symbol
type. The weight of the car is coded as the size of the symbol. In Figure 2.9, one can
see that increasing price gives lower m.p.g., although the relationship is far from linear.
The other factor is weight; weight clearly increases with price, and m.p.g. decreases with
weight. Again, referring to the categorical variable, origin, American cars cost more
than foreign cars, weigh more and get less mileage, although there is interaction and
overlap between all of these variables. There are also a few outliers. The very high
mileage, low cost, and light weight car is the VW Rabbit (Diesel) and the very heavy,
low mileage, and high cost car is the Cadillac Seville.
E. CODED DRAFTSMAN PLOTS 



1. Overview 

The DRAFTSMAN plot is an arrangement of SCATTER plots in which any 
adjacent pair of plots have a common axis (see [Ref. 1: p. 145] ). In this way, the 
practitioner can observe the relationships of the variables within a specific plot and, in 
addition, can follow any particular observation (or group of observations) through the
sequence of plots. Therefore, the analysis of multiple interactions among the variables
is possible. This DRAFTSMAN plot can further be enhanced by portraying one or
several additional variables by the assignment of symbols, sizes of the symbols, and
colors to the already displayed variables. This is the main idea behind the CODED
DRAFTSMAN plot, in which the techniques used in both DRAFTSMAN and
CODED SCATTER plots are combined to display a single plot.

2. Technical details of CODED DRAFTSMAN plot 

The CODED DRAFTSMAN plot, as mentioned earlier, can be seen as a
displayed array of several CODED SCATTER plots. In certain situations the
SCATTER plots may be deceiving due to the overlapping of data points. It may then be
necessary to jitter one or more variables in order to alleviate this problem. Also, the
practitioner may want to transform the data in order to achieve a simpler and more
understandable picture and in this way facilitate the analysis of the relationships
among the variables (see [Ref. 4]). Another technique used by statisticians to reduce
the spread in the data and to enhance the visual interpretation of the plots is to
smooth the data, relying on the Moving Averages technique or Locally Weighted
Regression (LOWESS) for this purpose. For further explanation of these two
techniques see Moran [Ref. 5]. These techniques (jitter, transformation, and smoothing)
are included in the CODED DRAFTSMAN program presented in this thesis, and they
can be invoked interactively.
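
Jittering and moving-average smoothing can be sketched generically in Python. The text does not specify the exact variants used by the program, so the uniform noise in `jitter`, the centered window in `moving_average`, and both function names are assumptions made for illustration only.

```python
import random


def jitter(values, amount, seed=None):
    """Add small uniform noise in [-amount, amount] to separate overlapping points."""
    rng = random.Random(seed)
    return [v + rng.uniform(-amount, amount) for v in values]


def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the two ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```

In a scatter plot the smoothing would be applied to the Y values after ordering the points by their X values.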
III. COMPUTER PROGRAMS: USER INSTRUCTIONS AND 
TECHNICAL DESCRIPTION 



A. GENERAL 

This chapter provides detailed instructions on how to use the computer programs
presented in this thesis. These programs were written in APL and are designed to be
used in conjunction with the experimental IBM graphical software GRAFSTAT. All
these programs are interactive, and all user defined parameters and option selections
are entered in response to program queries. Although no APL skills are required to
operate these programs, it is recommended that the user become familiar with APL
system commands and procedures to load and copy workspaces, groups and variables,
and understand the meaning of workspace, variables, groups and vectors in APL
terminology. The user should read VS APL AT NPS [Ref. 6] before attempting to use
these programs. The experienced APL user will find it easy to make changes to these
programs in order to accommodate any additional requirement.

These programs were designed to be used on the IBM 3033 computer and to be 
executed using an IBM 3277/TEK 618, 3278/3279 or 3179G2 graphic display terminals 
using a memory capacity of at least 2 Megabytes. 

All of these programs are contained within an APL workspace called
APLGRAFS, and are organized in different groups of functions (each group contains
those functions related to a specific program application). The reader can find a list of
groups and the content of each group of functions in Appendix A2. In order to make
use of these programs, the user must have access to this workspace and to the APL 
workspace called GRAFSTAT. 

There are two ways of executing these programs. The first one is described in the
following steps:

1) LOGON to the system.

2) Once in CMS, enter APLGST.

3) At the prompt CLEAR WS, type )LOAD GRAFSTAT.

4) Enter )PCOPY APLGRAFS groupname, where groupname is one of the groups
listed in Appendix A2.

5) Enter the name of the desired program to be executed (e.g., STARPLOT) and
then answer the queries.

The second mode is more user-friendly. The steps that must be followed are : 

1) LOGON to the system. 
2) Once in CMS, type APLGRAFS (this will cause the execution of the macro
APLGRAFS EXEC); you will then see a menu describing all the available
programs (see Figure 3.1).

3) After you enter the number corresponding to the selected program, you only
have to follow the instructions given on the screen (this macro will execute
steps 2), 3), and 4) of the previous list for you).



FILE: MENU MENU A1



YOU HAVE THE FOLLOWING PROGRAMS TO USE

(1) STAR AND PROFILE PLOTS

(2) BOX PLOTTED TABLES

(3) SYMBOLIC SCATTER PLOTS

(4) DRAFTSMAN DISPLAY

(5) LOWESS

(6) EXPLANATION ON THESE FUNCTIONS

(7) QUIT

TYPE THE NUMBER CORRESPONDING TO THE PROGRAM YOU WANT



Figure 3.1 Menu Presented by APLGRAFS EXEC. 

B. PROGRAM DESCRIPTION 

In order to use any of these programs, the user will need a matrix containing the 
sample data. This matrix could be in a CMS file, or could be a character array in an 
APL workspace. The programs will accept the data in either way; just follow the 
instructions given by the program as to the location of the data set. In addition, the 
user will need an APL two-dimensional character array containing the names of the 
variables which will appear in the display. These names are the labels which will be
shown on the axes of the plots or in the rows and columns of the tables, as in Figures
2.2 and 2.4. If the user has not previously created this array, the programs will allow
the user to enter the labels directly in response to a sequential series of queries. At this 
point, the user is ready to execute any of the programs. 

When answering the queries, if the user enters an erroneous response, the program
will prompt the user for a correct response only in certain cases: a YES or NO
question, a range question (e.g., 3, 4, or 5 plots), a request for an APL matrix whose
name does not exist in the workspace, and so on. In all other cases the program has no
means of checking the validity of the response, so it will accept any response as
a correct one. To cancel a program at any time during its execution, the user must
hit the PA2 key.

1. BOXPLOTTED TABLES 

This program is executed by entering the command BOXPLOTAB. Once this
command is entered, the program will start running by prompting the user with a
sequence of queries asking the user to enter the input arrays and to select the
different available options (see Appendix B1 for an example of program execution).

a. Input requirement. 

(1) The array containing the sample data, the array containing the names (labels)
of the columns, and the array containing the names (labels) of the rows.

(2) The title of the display. 

b. Options. 

(1) The data could be displayed as originally entered or could be displayed 
ordered (sorted) by the first column. 

(2) Once the BOXPLOTTED Tables are shown on the screen , the user will be 
prompted as to whether or not he or she wants to join the data points of the 
same position with lines. The position is given by the order of the data points 
of the first column (see Figure 2.3). 

(3) After finishing with the previous display, the user will be prompted whether or
not he or she wants to see BOXPLOTTED TABLES of the differences
between columns and, if so, whether absolute or relative differences are
desired. The difference could be calculated as follows : difference between all 
other columns with respect to the first one, or difference between adjacent 
columns (see Figure 2.4). 

2. STAR PLOTS and PROFILE PLOTS 

These two programs are executed by entering the command STARPLOT. The
program will start running and the user will be asked to enter the desired function: (S)
for STAR PLOT or (P) for PROFILE PLOT (see Appendices B2 and B3 for examples
of the execution of this program).

a. Input Requirements. 

(1) Same as for the BOXPLOTTED TABLES. 

b. Options. 

(1) Whether the whole original data is to be used or just a subsample of the data. 
The subsample could be constructed by selecting specific columns and/or 
rows. 

(2) The user will be asked how many plots per screen are desired. This could be 
3,4 or 5 plots per screen. 

3. CODED SCATTER PLOT 

The execution and the input requirements of this program are similar to those
of the STAR PLOT. To execute the program enter the command SCATPLOT (see
Appendix B4 for an example of the execution of the program).
a. Options. 

(1) The user will be asked to enter the title for the screen and the title for each 
plot on the screen. 

(2) For each plot, the user must enter the column to be used on the X and Y axis, 
and whether or not the entire data or a subsample of it is desired. 

(3) Another option is to select whether the data is to be jittered or if a 
transformation of the data (specified by the user) is desired. 

(4) Following this option, the user must select the position of each plot on the 
screen (1, 21, 22, ..., etc.). 

(5) Finally the user must specify the symbols, colors and sizes of the symbols
that will be used to represent a specific subset of the data. If the user selected
one plot per screen, the program will ask for a short description of each one
of these subsets or categories. These subsets are defined using APL
statements. The user can specify more than one subset in the same plot and
more than one plot per screen (see Figure 2.9).
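The idea of coding subsets by user-supplied expressions can be sketched as follows. The sketch uses Python predicates in place of the APL statements entered in SCATPLOT; the function `code_points`, the rule that the first matching category wins, and the sample rows are all hypothetical and serve only to illustrate the idea.

```python
def code_points(rows, categories, default="o"):
    """Assign a plotting symbol to each row.

    categories is a list of (predicate, symbol) pairs; the first predicate
    that holds for a row decides its symbol (an assumed tie-breaking rule).
    Rows matching no category receive the default symbol.
    """
    symbols = []
    for row in rows:
        for predicate, symbol in categories:
            if predicate(row):
                symbols.append(symbol)
                break
        else:
            symbols.append(default)
    return symbols


# Hypothetical rows; the field names are illustrative, not thesis data.
cars = [{"price": 4000, "foreign": 0},
        {"price": 9000, "foreign": 1},
        {"price": 15000, "foreign": 0}]
categories = [(lambda r: r["foreign"] == 1, "F"),
              (lambda r: r["price"] > 10000, "A")]
print(code_points(cars, categories))
```

Keeping the category definitions as data separate from the plotting step is what allows several subsets to share one plot.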

4. CODED DRAFTSMAN PLOTS 

This program is executed by entering the command DRAFTSMAN. Once this
command is entered, the program will start by prompting the user with a sequence of
queries asking the user to enter the input arrays and to select among the available
options (see Appendix B5 for an example of program execution).

a. Input requirement. 

(1) The array containing the sample data, and the array containing the names
(labels) of the columns.

b. Options. 

(1) The data could be used as originally entered, or could be jittered or
transformed. The user must select the desired option.

(2) Select whether or not a smoothed curve will be fitted to the data in all plots
on the screen. If the smoothed curve is selected, the user must indicate
whether the Moving Average or LOWESS technique will be used.

(3) Select between using the CODED DRAFTSMAN or the regular
DRAFTSMAN plot. If the former is selected, the user must enter an APL
expression, a symbol and its size, and the color for each category to be
represented.

(4) Select the number of plots desired per screen (the available options are 3, 4 or
5 rows and columns of plots per screen).

(5) Once the display is shown on the screen, and if the answer to option (2) was
no, the user now has the alternative of fitting a smooth curve to the data of
a specific plot.
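Of the two smoothing choices offered, the moving average is the simpler one. A minimal sketch in Python follows, assuming a centered window that is truncated at the ends of the data; the function name, the window parameter and the edge handling are illustrative assumptions and may differ from what the DRAFTSMAN program does.

```python
def moving_average(xs, ys, window=3):
    """Centered moving average of y after sorting the points by x.

    The window is shortened at both ends of the data (an assumed edge rule).
    Returns a list of (x, smoothed_y) pairs in increasing order of x.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    pairs = sorted(zip(xs, ys))
    half = window // 2
    smoothed = []
    for i, (x, _) in enumerate(pairs):
        lo = max(0, i - half)
        hi = min(len(pairs), i + half + 1)
        neighbours = [y for _, y in pairs[lo:hi]]
        smoothed.append((x, sum(neighbours) / len(neighbours)))
    return smoothed


# Hypothetical points, deliberately given out of order.
print(moving_average([3, 1, 2, 4], [9, 1, 4, 16], window=3))
```

LOWESS replaces the plain average in each window by a locally weighted regression, which makes it less sensitive to isolated extreme points.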
IV. DATA ANALYSIS



A. GENERAL 

The primary purpose of this chapter is to demonstrate the applications of the five
graphical techniques presented in this thesis, namely BOXPLOTTED tables, STAR
plots, PROFILE plots, CODED SCATTER plots and CODED DRAFTSMAN plots,
in the analysis of multivariate data. An attempt is made to highlight different
peculiarities in the sample data that could be found when the practitioner uses these
techniques; therefore, a full analysis of the various samples is not envisioned. However,
it will be seen that in using these techniques one can draw solid conclusions about
the behavior of certain characteristics of the population from which the sample is
drawn.

B. AN ANALYSIS OF HEALTH CARE EXPENSES 

The following data are a good example of a case in which the statistician
can make use of the BOXPLOTTED tables and the PROFILE plots. This is a sample
of panel data and represents the health care cost (per capita hospital expenses) of 14
health service areas throughout the State of California for the years 1971 to 1975.
Figure 4.1 is a BOXPLOTTED table which displays the average health care expenses
of the areas.

The data were formatted as a two dimensional array of 14 rows and 5 columns.
Each row of the array represents the average health care expenses of a given area, and
each column corresponds to the average expense for a given year. The data have been
ordered in decreasing order by the first column (year), i.e., the service area with the highest
health care expenses in the first year corresponds to the first row and so on. In general,
the boxplots in Figure 4.1 give an initial impression of the distribution of each
subsample. Notice that during the first three years the distribution
is definitely skewed to the right, caused by some possible outliers, indicating that some
service areas far exceed the average health care expenses. However, in the last two
years the tendency is the opposite, again with the exception of some possible outliers.

The initial impression given by the boxplots could further be exploited by an
analysis of the flow of the data in order to study the trend of the mean health care
costs, the variance of health care cost, and to identify the occurrence, or recurrence, of

Figure 4.1 Per Capita Health Care Expenses in
California Health Service Areas.




Figure 4.2 Per Capita Health Care Expenses in California
Health Service Areas (Most and Least Expensive Areas).

possible outliers. Another possible trend that could be studied is that of the relative
difference in health care expenses. One further area of interest is to study the trend of
the most and least expensive service areas through this period.

The tendency in the average health care cost during this period was expected to
be an increasing one (during this time, among other things, the inflation rate was
starting to increase very rapidly). This tendency can be seen in Figure 4.1. Notice that
this change is apparently quite linear up to 1974; in 1975 there is a big
jump in the average cost, which probably indicates that, overall, the trend in the
average health expense through this period was not linear. This same tendency is
present in the variance of health care cost, which seems to confirm the nonlinearity in
the average health care expense during this period. In this figure and in Figure 4.2,
where lines are used to trace the flow of the 1971 high and low cost areas through
subsequent years, it is also possible to readily pinpoint those service areas of extreme
average health care cost (possible outliers, as defined in Chapter II). Notice that
health service area number 4 is shown as a possible outlier through all years; it is
always at least 2.2σ from the mean cost. Service area number 3 has the same
tendency. These two areas are then the possible cause of the high variation observed in
the health care cost through this period. They actually represent the Los Angeles and
San Francisco metropolitan areas. In Figure 4.2, one could also follow those service
areas with lower average cost (these areas are joined by line segments at the bottom of
the display). It seems that these areas (numbers 1 and 14) had the lowest cost through
this period, with the exception of service area number 8, which had the lowest cost in 1971.
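The statement that an area lies a given number of standard deviations from the mean cost can be checked column by column. The following Python sketch is only an illustration: it uses the sample standard deviation and a hypothetical threshold `k`, whereas the thesis defines possible outliers in Chapter II, and that definition is not reproduced here.

```python
from statistics import mean, stdev


def sigma_distances(column):
    """Distance of each value from the column mean, in standard deviations."""
    m = mean(column)
    s = stdev(column)  # sample standard deviation (an assumed choice)
    if s == 0:
        raise ValueError("column has no variation")
    return [(v - m) / s for v in column]


def flag_far_points(column, k=2.0):
    """Row positions whose value lies at least k standard deviations from the mean."""
    return [i for i, z in enumerate(sigma_distances(column)) if abs(z) >= k]


# Hypothetical yearly column with one extreme service area (not thesis data).
year = [10, 11, 9, 10, 12, 10, 11, 9, 10, 30]
print(flag_far_points(year, k=2.0))
```

Applying the same function to every column of the table shows whether the same rows are flagged in every year, which is the pattern described above for areas 4 and 3.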

One could further follow the trend in the change of health care expense for each
respective service area by using the PROFILE plots. In Figure 4.4 each one of the
profile plots portrays the values of one row (health care service area), and the profiles
are ordered by the magnitude of the first column, as in the BOXPLOTTED tables. The
values of each column are represented in each profile according to the assignment
given in Figure 4.3. In Figure 4.4 it is quite easy to identify the health service areas
that had the highest and lowest health care expenses during this period; as was seen in
the BOXPLOTTED tables, these areas are numbers 4 and 3, and numbers 1 and 14,
respectively. One could also readily pinpoint the area with the most variability in health
care expenses; in this case notice that area number 13 has a greater change in health
expenditure than areas number 1, 14 and 4 (this last being the most expensive). Notice
that the highest variation in expenditure in area number 13 takes place during 1972 and
1973. This fact could also be captured in Figure 4.5 in columns 2 and 3.

Figure 4.3 Assignment of Variables to the Profile of the
Per Capita Health Care Expenses.




Figure 4.4 Profile Plot of the Per Capita Health Care Expenses
California Health Service Areas.

Now one could also reinforce the statement about the nonlinearity in the change
of the health expenditure by looking at the relative difference in this variable through
this period. Figure 4.5 portrays the trend in the relative difference in health expenditure
with respect to the first year of study (1971). In Figure 4.6 one could see the same
trend but now with respect to the previous year. Figure 4.6 definitely shows that the
change in health care expenses has a nonlinear behavior. It is changing linearly during
the first three years, and then at an accelerated, possibly quadratic, rate from then on.
The same trend seems to be shown in Figure 4.5; this trend is highlighted in the last
column, where the mean of the relative differences jumps from 48.91 to 76.50.

In Figures 4.5 and 4.6 it is also possible to identify those service areas that have
the maximum and minimum relative change. As an example, from 1971 to 1972 area
number 8 had the maximum positive increase, and from 1972 to 1973 the area with the
maximum positive change was area number 4.

C. AN ANALYSIS OF THE NEW YORK STOCK EXCHANGE 

In the previous analysis, the data considered consisted of the same type of
commensurable values, i.e., dollars through a period of time (one could consider this as
being multiple short time series data). In contrast with this type of data, the
practitioner can encounter multivariate data that represent different qualitative and
quantitative magnitudes. One example of this type is the data obtained from the stock
markets in the United States. Here again, the practitioner can make use of the
BOXPLOTTED tables as a tool for data analysis. The data to be analyzed were
extracted from the New York Times and represent the most active stocks (measured by
the number of shares traded) on the New York Stock Exchange for the week ended
August 8, 1986. The data are initially formatted as a two dimensional array consisting of
40 rows (representing each of the different trading companies) and 6 columns. The
columns correspond to the following variables.

(1) Volume of shares traded during the week (in 100,000 units).

(2) Closing price at the end of the week (in dollars). 

(3) Price change during the week (in percentage). 

(4) Price change during the last 12 months (in percentage). 

(5) Earnings per share during the last 12 months (in dollars). 

(6) Earnings per share during the last 12 months (in percentage).

Figure 4.5 Relative Differences in the Health Care Cost
California Health Service Areas (First Year).

Figure 4.7 shows the initial distributional characteristics (in the form of boxplots)
of each subsample. One of the first visual messages from these plots is the presence of
outliers in each column. Here is where the power of this new graphical technique lies:
one can easily identify those possible outliers by looking at the tabular data adjacent to
the boxplots, although this is easier in the first column since the data is ordered by that
column. As an example, looking at the first boxplot and the first column, it is easy to
identify Owens Corning and the Mobil Corp. as possible outliers: the volume traded by these two
companies is more than 2.7σ from the average volume traded.




Figure 4.7 40 Most Active Stocks for the Week Ended August 8, 1986
(New York Exchange).

Another observation that can be made from this figure is the absence of
statistical correlation among the variables when these are compared in the order shown
in Figure 4.7. In this case, the sample serial rank correlations are obtained by
comparing the adjacent columns. Looking at the sample serial rank correlations, one
could conclude that there is no statistical relationship between, as an example, the
volume of shares traded and the price at which the share closed at the end of the week.

Figure 4.8 40 Most Active Stocks for the Week Ended August 8, 1986
(New York Exchange) with Lines Connecting the Two Higher Stocks.

The only positive indication is a relationship between percentage change during the last
week (column 4) and percentage change in the last year (column 5). One can visually
confirm this lack of correlation by identifying the maximum and minimum values of
adjacent columns. For example, the Am Motor Co. had the lowest volume
of shares traded during that week, but the LTV Corp. had the lowest closing price.
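The comparison of adjacent columns by rank correlation can be sketched as follows. This section does not restate the exact formula used by BOXPLOTAB, so the Python sketch below assumes a Spearman-type coefficient (the Pearson correlation of the ranks, with tied values given their average rank); the function names are hypothetical.

```python
def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = average
        i = j + 1
    return r


def pearson(x, y):
    """Ordinary product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def adjacent_rank_correlations(table):
    """Rank correlation between each column and the column to its right."""
    cols = list(zip(*table))
    return [pearson(ranks(cols[j]), ranks(cols[j + 1]))
            for j in range(len(cols) - 1)]


# Hypothetical 3-row table: columns 1 and 2 agree in rank, 2 and 3 are reversed.
print(adjacent_rank_correlations([[1, 10, 3], [2, 20, 2], [3, 30, 1]]))
```

Because only adjacent columns are compared, the values obtained depend on the order in which the variables are placed in the table.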

Notice that one is not only interested in the stock which is the most active during
the week. One is also interested in which stock has the greatest (absolute or relative)
change in price, and whether this is related to other factors like earnings (absolute or
relative). With this in mind, it is possible to follow those stocks that have the largest
value in each of the variables considered in the analysis. Figure 4.8 shows the two
stocks which have this characteristic. These stocks are joined by line segments. It is
easy to see that the Owens Corning Co. is the stock with the highest volume of shares
traded during the week and also the largest price change (in percentage) during the last
twelve months; also, the IBM Co. has the highest closing price at the end of the week
and has the second largest earnings per share (in dollars) during the last year. Likewise,
the practitioner could follow the trend of those stocks with the lowest value in each of
the variables considered, or even the mid-point values.

Differences have no meaning here but it is interesting to trace the movement of 
the most active stocks (in volume) to other indicators (columns). 

D. AN ANALYSIS OF AUTOMOBILE DATA 

The purpose of the following analysis is to explore some important
descriptive characteristics of different types of automobiles and to attempt to
find any relation between these characteristics. As is shown in this analysis, the
STAR plot, CODED SCATTER plot and CODED DRAFTSMAN plot
techniques are paramount experimental statistical tools in this type of analysis. It is
appropriate to mention at this time that one other author has previously made use of
the data treated here and has written an outstanding analysis (see [Ref. 4]). The
purpose here is to demonstrate how one can arrive at the same general conclusions
using these new techniques. The new technique is the enhancement of SCATTER and
DRAFTSMAN plots by coding in other variables.

The data represent three general categories of quantitative and qualitative
characteristics of American and Foreign automobiles of 1979 (the data was obtained
from the Consumer Report Review). These categories are: performance, dimension and
price. The variables under these categories are as follows.

In category one: mileage in miles per gallon, repair records for 1977 and 1978
(rated on a 5 point scale; 5 = best and 1 = worst), turning diameter (clearance required
to make a U-turn) in feet, gear ratio for high gear.

In the second category : headroom in inches, weight in pounds, length in inches, 
displacement in cubic inches. 

And under the last category: price in dollars. This data was initially formatted
into a two dimensional array consisting of 74 rows (name of automobiles) and 13
columns. Each column corresponds to one of the variables mentioned above, and
the last column corresponds to an ordinal variable denoting American or Foreign car.
This variable has been added to the original data to demonstrate one of the many
possible applications of the CODED SCATTER plot and of the CODED
DRAFTSMAN plot introduced in this thesis; as an example, one can readily identify if
a certain deviation from a possible pattern is due to American or Foreign cars.

Figure 4.9 Assignment of Variables to the Rays of the Star
Automobile Data.



Figure 4.10 STAR Plot of Automobile Data, 10 Lighter and
the 10 Heavier Automobiles.

As an initial starting point for this analysis, one could study the characteristics of
each one of the individual automobiles. The STAR plot technique was chosen for this
purpose. Figure 4.9 shows the assignment of the twelve characteristics to the rays of
the star. In the study of this type of data, it is interesting to highlight the favorable
characteristics of each automobile. So, as in Chambers [Ref. 1], the larger the ray of the
star, the more positive that attribute is for the respective automobile. To make price,
turning diameter and gear ratio favorable, these variables were multiplied by -1 (i.e.,
the larger the ray corresponding to price, the less expensive the car is). The star is
arranged in such a way that the statistics corresponding to the cost and performance
categories are rising upward and horizontally, and those rays pointing downwards
correspond to variables closely related to the dimension of the automobiles. Appendix
C shows the complete STAR Plots for the 74 automobiles.
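The preparation of the data for a star, namely negating the unfavorable variables and converting each variable to a ray length, can be sketched in Python as follows. The min-max rescaling to the interval [floor, 1], the equally spaced ray angles and all names are assumptions made for illustration; the scaling actually used by STARPLOT is not stated in this section.

```python
import math


def star_rays(table, negate=(), floor=0.1):
    """Convert a row-major table into ray lengths in [floor, 1].

    Columns whose index is in `negate` are multiplied by -1 first, so that a
    longer ray always means a more favorable value. Each column is then
    rescaled linearly (an assumed scaling rule).
    """
    scaled_cols = []
    for j, col in enumerate(zip(*table)):
        col = [-v for v in col] if j in negate else list(col)
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_cols.append([1.0 if span == 0
                            else floor + (1.0 - floor) * (v - lo) / span
                            for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def ray_endpoints(lengths):
    """Endpoints of equally spaced rays drawn from the origin."""
    n = len(lengths)
    return [(r * math.cos(2 * math.pi * k / n),
             r * math.sin(2 * math.pi * k / n))
            for k, r in enumerate(lengths)]


# Hypothetical cars: (price, mileage); price is negated so cheap = long ray.
cars = [[4000, 30], [8000, 20], [6000, 10]]
for lengths in star_rays(cars, negate={0}):
    print(lengths, ray_endpoints(lengths))
```

Joining the endpoints of one row in order gives the outline of the star for that automobile.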

The array of stars is ordered by weight, the first and last stars
corresponding to the lightest and heaviest automobiles respectively. Figure 4.10
displays a summary of Appendix C, showing the 10 heaviest and 10 lightest
automobiles. The idea behind this arrangement is, as commonly accepted, that weight
is positively correlated with safety. Note the switch between the first (Honda Civic) and
last (Lincoln Continental). For the first of these all positive values are above the line;
for the latter this is switched. Note too that the variable of greatest interest to
Consumers Reports, Repair 78, is the vertical ray.

In Figure 4.10, it is easy to see that nine out of the ten lighter cars, in the top
panel, are of foreign make, the exception being the Ford Fiesta, and that the 10
heavier cars are American. From the STAR plot of Figure 4.10, one could also
compare other characteristics among these automobiles. As an example, in terms of
the price variable alone, notice that among the 10 heavier cars there are
4 American cars that are inexpensive compared with most of the lighter foreign cars
(these American cars are the Mercury Cougar and Cougar XR-7, the Buick Electra and, to a
lesser degree, the Oldsmobile 98). Also, in terms of repair records (of 1977 and 1978), the
American cars among the heavier ones compare with the foreign cars in the
other group. The information is abundant in these plots. However, when there are
many variables involved in the analysis it is questionable whether the practitioner can
actually capture the behavior of one variable alone or the joint behavior of two or
more variables. In this case one would like to see if there is any relation (linear or
of another type) between price and weight or, say, displacement and price (it is difficult to
identify any deviation, if it exists, from a possible relation in the STAR plot). In this
type of situation, once the practitioner has an initial impression of the data, it is
time to make use of other exploratory data analysis techniques, such as CODED
SCATTER plots and CODED DRAFTSMAN plots.

To continue the analysis it was desired to study any possible relation among the
price, mileage per gallon, weight and displacement of the automobiles and to compare
how American cars stand against the Foreign cars. The relations between these
variables are examined in Figure 4.11 by using a CODED DRAFTSMAN plot. It was
expected to see positive correlation between displacement and weight. This can easily
be seen in the plot position 2,2 of Figure 4.11. The two possible outliers in plot 2,2 of
Figure 4.11 show that there are two American cars that stand favorably among all
others. They are lighter cars with high displacement. From the figures in Appendix C
these two automobiles were identified as the Chevrolet Chevette and the Buick Opel. In
terms of price, it is also possible to conclude from plot position 3,3 of Figure 4.11 that
there is a negative relation, as expected, between price and weight. Notice, in the plot
position 2,1, that there seem to be two types of subsamples within the data, one of
foreign cars and the other of American cars (the foreign cars standing favorably against
the American ones); however, both subpopulations have the same trend, namely, that
weight increases with price. There are a couple of interpretations of this plot, besides the
obvious dichotomy between American and Foreign autos. One is that if you want a
heavy car, you will have to pay more if you also want it foreign made.

An expansion of Figure 4.11 is given by the CODED SCATTER plot, which
has additional variables coded in as symbol type and size. Looking at price versus
m.p.g. in the CODED SCATTER plot of Figure 4.12, one can confirm the idea that
the higher the price of the automobile, the fewer miles per gallon are expected. Notice that
with this figure it is possible to analyze four variables at the same time: price and
miles per gallon being the axes, and weight and nationality the coded variables. It is
interesting to notice that one American and one Foreign car tend to deviate from the
norm. The American car is the Cadillac Seville, with a very high price, quite heavy, but
with relatively good mileage; the Foreign car, the VW Rabbit (Diesel), is at the opposite
side of the spectrum. In the middle of the plot is a medium price, foreign car with very
good mileage. This is a BMW 320i.

Figure 4.11 CODED DRAFTSMAN Plot of Automobile Data
A = American, F = Foreign.




Figure 4.12 CODED SCATTER Plot of Automobile Data
Price vs MPG (A = American, F = Foreign and Size = Weight).

E. AN ANALYSIS OF CONTRACT DATA



The purpose of this analysis is to demonstrate other possible applications of the
CODED SCATTER plot graphical technique as a tool in exploratory data analysis.
It is also appropriate to emphasize that the data considered in this section have been
amply analyzed by other authors ([Ref. 4]), and again the purpose is only to highlight
the use of this graphical technique.

The data consisted of 177 contracts (rows), which were authorized by the
Department of Defense during the period of 1949 through 1963. The columns consist
of 11 variables of possible interest to the Department of Defense on how it has
interfaced on a contractual level with the private sector of manufacturers. The data
represent contracts let with 23 major contractors during this period, and include
information concerning 7 types of manufactured products, ranging in complexity from
drone aircraft to missiles and helicopters. The 11 variables are listed below:

(1) Deviation from target cost (percent). 

(2) Months to complete the contract.

(3) Target profit of manufacturer (percent). 

(4) Sharing ratio (percent). 

(5) Ceiling price (percent of target price). 

(6) Target cost. 

(7) Number of items produced in the contract. 

(8) Number of contracts let that year.

(9) Year the contract was signed. 

(10) Contractor awarded the contract. 

(11) Type of system. 

Due to the diversity of the data and the purpose of this section, it was decided to
narrow the objective of the analysis to a single issue, which is probably the most
important to the Department of Defense: an attempt will be made to see if there is any
increase (or decrease) in the deviation from the manufacturer target cost through time.
The one deviation that is considered to be the most significant will be the positive one,
since this phenomenon would represent additional expenditure to the government.
Thus, the task is to try to find a possible cause of this increase.

Among the other 10 factors, it was hypothesized that the year in which the
contract was signed and the time (in months) to complete the contract had significant
influence on the deviation from the manufacturer original target cost. The variable year
signed was considered since the period of study includes an event that had significant
impact on the US economy: the Korean War; therefore, it was expected that smaller
contractors, not really prepared to react to the contingency of war production, would
be less capable of making accurate predictions. Figure 4.13 shows the display of the
year the contract was signed versus the deviation from target cost. It was also
expected that contractors increasing from normal production levels would also be
affected in their prediction capabilities. The hypothesis about the time to complete the
contract is based on a simple idea: the wider the interval of time for which the
prediction is made, the lower the probability of the prediction being accurate. It was also
desired to see the trend in this deviation for the major contractors, since these
were probably of greatest interest to the government. In Figures 4.13 and 4.14 three
major contractors were selected as being of relative importance: Lockheed, Douglas and
Grumman. These three are coded by the initial letter. It is easy to see that the actual
year that the contract was signed does not really influence the deviation from target
cost; the deviations are evenly distributed across the period of interest. However, notice
that during 1951 and 1952 (Korean War period) the deviations are mainly on the
negative side (probably the significance of patriotism) and thereafter are evenly
distributed.




Figure 4.13 Year Signed vs Dev. From Target Cost, Contract
Data (L = Lockheed, G = Grumman, D = Douglas, o = Others).



The other variable of interest was then considered, namely the time to complete
the contract. The range of this variable is from around 15 months to 130 months.

Figure 4.14 Months to Complete vs Dev. Target Cost, Contract
Data (L = Lockheed, G = Grumman, D = Douglas, o = Others).

Figure 4.14 shows the plot of those contracts that took less than 40 months, 40 months
to less than 70, and 70 or more months versus cost deviation, respectively. It is clear
that there is some form of positive relation between cost deviation and those contracts
that took more than 70 months to be completed, confirming the initial hypothesis.
There are some exceptions to this conclusion, and these are mainly contracts that were
given to two of the largest contractors, Lockheed and Grumman, and possibly two
smaller ones.
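The grouping of contracts by completion time used for Figure 4.14 can be sketched as below. The class boundaries (40 and 70 months) come from the text; the function names and the sample contracts are hypothetical, and the sketch is in Python rather than the APL expressions a user would enter in SCATPLOT.

```python
def duration_group(months):
    """Class of a contract by its time to completion, as in Figure 4.14."""
    if months < 40:
        return "under 40"
    if months < 70:
        return "40 to under 70"
    return "70 or more"


def group_deviations(contracts):
    """Collect cost deviations (percent) by duration class.

    contracts is an iterable of (months_to_complete, deviation) pairs.
    """
    groups = {"under 40": [], "40 to under 70": [], "70 or more": []}
    for months, deviation in contracts:
        groups[duration_group(months)].append(deviation)
    return groups


# Hypothetical contracts: (months to complete, deviation from target cost).
sample = [(15, -2.0), (39, 1.5), (40, 0.0), (69, 3.0), (70, 8.0), (130, 12.5)]
for name, values in group_deviations(sample).items():
    print(name, values)
```

Each class can then be plotted, or summarized, separately against the deviation from target cost.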

These plots clearly demonstrate the care that must be taken in the
analysis of single scatter plots: one scatter plot portrays only an isolated relationship between
two variables and may not indicate a causal relationship. One should make use of
different exploratory data analysis techniques in an attempt to discover possible trends
in the data being analyzed.

APPENDIX A

COMPUTER PROGRAMS 



1. APLGRAFS EXEC. 

This exec program presents a menu with all the programs available in the APL
workspace APLGRAFS. After the selection is made, the exec will load the necessary
workspaces and will prompt the user to enter the name of the selected program.



&TRACE
SET BLIP *
-ONE
CLRSCRN
&TRACE
&TYPE
&TYPE YOU HAVE THE FOLLOWING PROGRAMS TO USE
&TYPE
&TYPE STAR AND PROFILE PLOTS              (1)
&TYPE BOX PLOTTED TABLES                  (2)
&TYPE SYMBOLIC SCATTER PLOTS              (3)
&TYPE DRAFTSMAN DISPLAY                   (4)
&TYPE LOWESS                              (5)
&TYPE EXPLANATION ON THESE FUNCTIONS      (6)
&TYPE QUIT                                (7)
&TYPE
&TYPE TYPE THE NUMBER CORRESPONDING TO THE PROGRAM YOU WANT
&READ VAR &OPT
&IF &OPT = 7 &GOTO -FINAL
&IF &OPT < 1 &GOTO -ERROR1
&IF &OPT > 6 &GOTO -ERROR1
* CP DEFINE STORAGE 2048K
* &STACK I CMS
CP TERMINAL APL ON
&STACK )LOAD GRAFSTAT
* THE COMPARISONS ON THE NEXT FIVE LINES ARE ILLEGIBLE IN THE LISTING; = N IS ASSUMED
&IF &OPT = 2 &GOTO -TWO
&IF &OPT = 3 &GOTO -THREE
&IF &OPT = 4 &GOTO -FOUR
&IF &OPT = 5 &GOTO -FIVE
&IF &OPT = 6 &GOTO -SIX
&STACK 'NOW LOADING , DONT TOUCH YOUR KEYBOARD'
&STACK )PCOPY APLGRAFS GSTARPLOT GDEMO
&STACK )PCOPY 990 CMSIO
&STACK ' '
&STACK 'FOR A DESCRIPTION OF THESE FUNCTIONS TYPE'
&STACK ' '
&STACK ' INSTRUCTIONIONS '
&STACK ' '
&STACK ' '
&STACK ' TO EXECUTE THE FUNCTION STARPLOT TYPE '
&STACK ' '
&STACK ' STARPLOT '
&STACK ' '
&GOTO -SEVEN
-TWO &IF &OPT > 2 &GOTO -THREE
&STACK 'NOW LOADING , DONT TOUCH YOUR KEYBOARD'
&STACK )COPY APLGRAFS GBOXPLOTAB GDEMO
&STACK )PCOPY 990 CMSIO
&STACK ' '
&STACK 'FOR A DESCRIPTION OF THESE FUNCTIONS TYPE'
&STACK ' '
&STACK ' INSTRUCTIONIONS '
&STACK ' '
&STACK ' '
&STACK ' TO EXECUTE THE FUNCTION BOXPLOTAB TYPE '
&STACK ' '
&STACK ' BOXPLOTAB '
&STACK ' '
&GOTO -SEVEN
-THREE &IF &OPT > 3 &GOTO -FOUR
&STACK 'NOW LOADING , DONT TOUCH YOUR KEYBOARD'
&STACK )PCOPY APLGRAFS GSCATPLOT GDEMO
&STACK )PCOPY 990 CMSIO
&STACK ' '
&STACK 'FOR A DESCRIPTION OF THESE FUNCTIONS TYPE'
&STACK ' '
&STACK ' INSTRUCTIONIONS '
&STACK ' '
&STACK ' '
&STACK ' TO EXECUTE THE FUNCTION SCATPLOT TYPE '
&STACK ' '
&STACK ' SCATPLOT '
&STACK ' '
&GOTO -SEVEN
-FOUR &IF &OPT > 4 &GOTO -FIVE
&STACK 'NOW LOADING , DONT TOUCH YOUR KEYBOARD'
&STACK )PCOPY APLGRAFS GDRAFTSMAN GDEMO
&STACK )PCOPY 990 CMSIO
&STACK ' '
&STACK 'FOR A DESCRIPTION OF THESE FUNCTIONS TYPE'
&STACK ' '
&STACK ' INSTRUCTIONIONS '
&STACK ' '
&STACK ' '
&STACK ' TO EXECUTE THE FUNCTION DRAFTSMAN TYPE '
&STACK ' '
&STACK ' DRAFTSMAN '
&STACK ' '
&GOTO -SEVEN
-FIVE &IF &OPT > 5 &GOTO -SIX
&STACK 'NOW LOADING , DONT TOUCH YOUR KEYBOARD'
&STACK )PCOPY APLGRAFS GLOWESS GDEMO
&STACK )PCOPY 990 CMSIO
&STACK ' '
&STACK 'FOR A DESCRIPTION OF THESE FUNCTIONS TYPE'
&STACK ' '
&STACK ' INSTRUCTIONIONS '
&STACK ' '
&STACK ' '
&STACK ' TO EXECUTE THE FUNCTION LOWESS TYPE : '
&STACK ' '
&STACK ' LOWESS '
&STACK ' '
&GOTO -SEVEN
-SIX CLRSCRN
&CONTROL OFF
&BEGTYPE -PAPA

THIS WORKSPACE CONTAINS PROGRAMS THAT MAY BE USED AS EXPLORATORY
DATA ANALYSIS TOOLS. PROGRAMS THAT ARE USED TOGETHER ARE
CONTAINED IN GROUPS.



THE GROUPS CURRENTLY AVAILABLE ARE GDEMO , GSCATPLOT ,GBOXPLOTAB , 
GSTARPLOT, GDRAFTSMAN AND GLOWESS, WHERE THE G STANDS FOR GROUP. 

IF YOU HAVE COPIED THE WHOLE WORKSPACE APLGRAFS YOU CAN SEE A 
LIST OF THESE GROUPS AT ANY TIME BY DROPPING INTO APL AND TYPING : 



JGRPS 



GROUPS : 



GDEMO THIS GROUP CONTAIN SOME DATA SETS TO BE USED 

FOR ILLUSTRATION BY THE PROGRAMS IN THIS HS. 

GSCATPLOT THIS GROUP CONTAIN ALL OF THE PROGRAMS REQUIRED 



TO PRODUCE SYMBOLIC SCATTER PLOT OF THO OR MORE 
DIMENSIONAL DATA. A BASIC DISCUSSION OF THESE 
DISPLAYS IS CONTAINED IN 'GRAPHICAL METHODS FOR 
DATA ANALYSIS' By CHAMBERS (PAGE 157) . 

TO EXECUTE THIS PROGRAM TYPE : 

SC ATP LOT 

AND THEN ANSHER THE QUESTIONS. 

YOU MOULD NEED THE FOLLOHING THO DIMENSIONAL 
ARRAY : 

- ARRAY OF DATA ( IN APL INSIDE THE 

WS OR OUTSIDE AS A FORTRAN FILE) 

FOR A DEMO USE THE FOLLOHING ARRAY : 

DATA > CALHOS 

CALHOS CONSISTS OF COST PER PATIENT IN 1* GEO- 
GRAPHICAL DISTRICTS (ROMS) OF CALIFORNIA OVER 5 
YEARS (COLUMNS). 

GBOXPLOTAB  THIS GROUP CONTAINS ALL OF THE PROGRAMS REQUIRED
            TO PRODUCE BOX PLOTTED TABLES (A COMBINATION OF
            BOX PLOTS AND A TABLE WITH THE ORIGINAL DATA ON
            THE SAME DISPLAY). TO EXECUTE THIS PROGRAM TYPE :

                  BOXPLOTAB

            AND THEN ANSWER THE QUESTIONS.

            YOU WOULD NEED THE FOLLOWING TWO DIMENSIONAL
            ARRAYS :

            - ARRAY OF DATA (IN APL INSIDE THE
              WS OR OUTSIDE AS A FORTRAN FILE).

            - ARRAY OF NAMES OF COLUMNS (AN ARRAY OF
              DIMENSION [NCOL,20])

            - ARRAY OF NAMES OF ROWS (AN ARRAY OF
              DIMENSION [NROW,20])

            IF YOU DONT HAVE THE ARRAYS OF NAMES THE PROGRAM
            WILL ASK YOU TO ENTER THE NAMES ONE BY ONE.

            FOR A DEMO USE THE FOLLOWING ARRAYS :

                  DATA      ---> CALHOS
                  ROW NAMES ---> CALHOSR
                  COL NAMES ---> CALHOSC

GSTARPLOT   THIS GROUP CONTAINS ALL OF THE PROGRAMS REQUIRED
            TO PRODUCE STAR AND PROFILE PLOTS OF TWO OR MORE
            DIMENSIONAL DATA. A BASIC DISCUSSION OF THESE
            DISPLAYS IS CONTAINED IN 'GRAPHICAL METHODS FOR
            DATA ANALYSIS' BY CHAMBERS (PAGES 158-163)

            TO EXECUTE THIS PROGRAM TYPE :

                  STARPLOT

            AND THEN ANSWER THE QUESTIONS.

            YOU WOULD NEED THE FOLLOWING TWO DIMENSIONAL
            ARRAYS :

            - ARRAY OF DATA (IN APL INSIDE THE
              WS OR OUTSIDE AS A FORTRAN FILE)

            - ARRAY OF NAMES OF COLUMNS (AN ARRAY OF
              DIMENSION [NCOL,20])

            - ARRAY OF NAMES OF ROWS (AN ARRAY OF
              DIMENSION [NROW,20])

            IF YOU DONT HAVE THE ARRAYS OF NAMES THE PROGRAM
            WILL ASK YOU TO ENTER THE NAMES ONE BY ONE.

            FOR A DEMO USE THE FOLLOWING ARRAYS :

                  DATA      ---> CARS
                  ROW NAMES ---> CARSR
                  COL NAMES ---> CARSC

            CARS IS THE CAR REPAIR DATA GIVEN BY CHAMBERS,
            ET AL.

GDRAFTSMAN  THIS GROUP CONTAINS ALL OF THE PROGRAMS REQUIRED
            TO PRODUCE DRAFTSMAN DISPLAYS OF TWO OR THREE
            DIMENSIONAL DATA. A BASIC DISCUSSION OF THESE
            DISPLAYS IS CONTAINED IN 'GRAPHICAL METHODS FOR
            DATA ANALYSIS' BY CHAMBERS (PAGES 136-140)

            DETAILED EXPLANATIONS OF THESE PROGRAMS ARE
            CONTAINED IN 'DRAFTSMAN DISPLAY, A GRAPHICAL
            EXPLORATORY DATA ANALYSIS TECHNIQUE' AN NPS THESIS
            BY CAPT. MALCOLM JOHNSON, USA.

            THESE PROGRAMS ARE COMPLETELY INTERACTIVE AND CAN
            BE INITIATED BY TYPING :

                  DRAFTSMAN

            AND THEN ANSWER THE QUESTIONS.

            YOU WOULD NEED THE FOLLOWING TWO DIMENSIONAL
            ARRAYS :

            - ARRAY OF DATA (IN APL INSIDE THE
              WS OR OUTSIDE AS A FORTRAN FILE)

            - ARRAY OF NAMES OF COLUMNS (AN ARRAY OF
              DIMENSION [NCOL,20])

            - ARRAY OF NAMES OF ROWS (AN ARRAY OF
              DIMENSION [NROW,20])

            IF YOU DONT HAVE THE ARRAYS OF NAMES THE PROGRAM
            WILL ASK YOU TO ENTER THE NAMES ONE BY ONE.

            FOR A DEMO USE THE FOLLOWING ARRAYS :

                  DATA      ---> CARS
                  ROW NAMES ---> CARSR
                  COL NAMES ---> CARSC

GLOWESS     THIS GROUP CONTAINS ALL OF THE PROGRAMS REQUIRED TO
            USE THE ROBUST LOCALLY WEIGHTED REGRESSION SCATTER
            PLOT SMOOTHING TECHNIQUE DESCRIBED IN 'GRAPHICAL
            METHODS FOR DATA ANALYSIS' BY CHAMBERS (PAGE 121).
            DETAILED EXPLANATION OF THESE PROGRAMS IS
            PRESENTED IN 'LOCALLY WEIGHTED REGRESSION AND
            SCATTER PLOT SMOOTHING, A GRAPHICAL EXPLORATORY
            DATA ANALYSIS TECHNIQUE' AN NPS THESIS BY
            CDR GARY W MORAN, USN.

            THESE PROGRAMS ARE COMPLETELY INTERACTIVE AND
            CAN BE IMPLEMENTED BY TYPING :

                  LOWESS

            AND THEN ANSWER THE QUESTIONS.

            YOU WOULD NEED THE FOLLOWING TWO DIMENSIONAL
            ARRAYS :

            - ARRAY OF DATA (IN APL INSIDE THE
              WS OR OUTSIDE AS A FORTRAN FILE)

            - ARRAY OF NAMES OF COLUMNS (AN ARRAY OF
              DIMENSION [NCOL,20])

            - ARRAY OF NAMES OF ROWS (AN ARRAY OF
              DIMENSION [NROW,20])

            IF YOU DONT HAVE THE ARRAYS OF NAMES THE PROGRAM
            WILL ASK YOU TO ENTER THE NAMES ONE BY ONE.

            FOR A DEMO USE THE FOLLOWING ARRAYS :

                  DATA      ---> CARS
                  ROW NAMES ---> CARSR
                  COL NAMES ---> CARSC

-PAPA 

&GOTO -ONE 
-SEVEN EXEC APLGST 
&EXIT 100 

-ERROR1 &TYPE YOUR VALUE HAS TO BE BETWEEN 1 AND 6 TRY AGAIN 
&GOTO -ONE 
-FINAL &EXIT 100 
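
The help text above asks for row and column names as character arrays of dimension [NROW,20] and [NCOL,20], that is, one fixed-width row per label. As an illustration only, the following Python sketch (not part of the APL workspace; the function name_matrix and the sample labels are hypothetical) pads or truncates a list of labels to that shape.

```python
def name_matrix(labels, width=20):
    """Return one fixed-width string per label (truncate or blank-pad to `width`).

    Hypothetical helper: it mimics the [N,20] character arrays the help text
    asks for, but is not taken from the APL workspace.
    """
    return [label[:width].ljust(width) for label in labels]


# Example: two column labels stored as a 2-row, 20-column character array.
col_names = name_matrix(["COST 1980", "COST 1981"], width=20)
```

Each row has exactly 20 characters, so the list can be treated as a rectangular character matrix in the same way as the CALHOSC and CARSC demo arrays.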



2. APLGRAFS VSAPLWS 

The following is a description of the content of the APL workspace APLGRAFS 
VSAPLWS, which contains all the functions needed to use the programs described in 
this thesis. This workspace contains several groups; each group is related to a 
specific program and contains the functions required to execute that program. 
Following is a list of the groups and the functions inside them. 



Group      GBOXPLOTAB

Functions  BOXPLOTAB   ADMI        BOXLINES


Group      GDRAFTSMAN

Functions  DRAFTSMAN   DRASYM      DRAFT
           REPEATCK    MINMAX      TRANSFORM
           JJITTER     SUB         ADMINS
           LABELS      REGRES      REGRES2
           LOWS        YMAVS       MOVS
           MMOVAV      GARY        GARY2


Group      GLOWESS

Functions  REPEATCK    LOWS        DATAINPUT
           LOWESS      REGRES      REGRES2
           PLOTQUERY


Group      GSCATPLOT

Functions  MINMAX      SCATPLOT    TRANSFORM
           ADMI        JJITTER


Group      GSTARPLOT

Functions  TRANSFORM   STARPLOT    ADMI

3. APL PROGRAMS 

This section contains the listings of the APL programs written for this 
thesis, and modified versions of some existing APL programs taken from Johnson 
[Ref. 4] and from Moran [Ref. 5]. 

a. BOXPLOTTED tables (Program BOXPLOTAB) 






BOXPLOTAB;DATAO;DATA;IPL;NROW;NNCOL;NNNCOL;NCOL;PLO;
UIND;YL;S20;DIF;LHEAD;YN;UIND1;UIND2;ORD;ORDEN;SORT;
ORD1;XL;BAS1;COL20;XX;MEA;VAR;MED;DATA2;SCRE;TX;LX;
SIZ;YY;COA;DATAO1;OPC1;BAS0;CONT;CAM1
ADMI 

DATAO+DATA 
IPL+ 0 

(PPCC*'Y» )/0 
PPOMl-t- CAM 
(NROW<50 )/L01 

' THE MAXIMUM NUMBER OF ROWS ALLOWED IS 50 , TRY AGAIN ' 

0 

L0l:NNNCOL+NNCOL+ i+ CAMS 
C02 :PLO+UIND<-XL<-YL<-S20<-20 DIF* 1 ' 

' ENTER THE SCREEN LABEL ' 

LHEAD * D 

'DO YOU HAVE A (NCOL × 20 CHARS) MATRIX WITH THE NAMES 
OF COLUMNS Y/N?' 

YN* 1+3 

+(YN*'Y' )/L 03 

' ENTER THE NAME OF THE MATRIX ' 

UIND MD 

UIND M, ( (NNCOL, 5 ) ' ' ), (UINDlZl 15] ) 

->£015 
T03 : _M0 

UIND1+UIND2+' ' 

£04: 1*1+1 

' ENTER THE LABEL FOR COLUMN NUMBER ' , 3> (I ) 

UIND1+UIND1 , ("20+(S20,D)) 

*(I<NNCOL)/L 04 
£015: M0 

' DO YOU HAVE A (NROW × 15 CHARS) MATRIX WITH THE NAMES 
OF ROWS Y/N?' 






IW + Q 
* (YN* ' Y ' )/£014 
' ENTER THE NAME OF THE MATRIX ' 

UIND 2-HD 

UIND2<-UIND2 [ ; 15] 

->£0 5 5 

£014 :MM 1 

' ENTER THE LABEL FOR ROW NUMBER ' , 9 ( J ) 

UIND2+UIND2 , (15+(D,S20)) 

*(I<NROW )/£014 
UIND2* (NROW ,15) UIND2 
£055 :0RD* NROW 

' DOU YOU WANT THE DATA ORDERED BY THE FIRST COLUMN? Y/N ' 
*( 'N' =J0-«-l+E])/£O56 
+('Y'*YO)/L 055 
ORD*tyDATAO [ : 1 ] 

DATAO+DATAO LORD ; ] 

UIND2*UIND2LORDi2 
£0 56 : Ml 
JOR<-DATAO 
JORL ; 1]*0RD 
£057 :MJ+1 
JOP[j J3+7CAMO[sJ] 

*(I<NNCOL)/L 057 
£0 5 :* (NNCOL<,6 )/£ 06 



2VN9PL0VLHEADVXL 



IPL+IPL + 1 
NCOL+ 6 

NNCOL+NNCOL - 6 

-*-50 7 

50 6: NCOL+NNCOL 
IPL+IPL + 1 
NNCOL+ 0 

L07 :DATA+DATAOL; (((IPL- 1) 6 )+NCOL)l 
JER+JORL: (((IPL-1) 6 )+NCOL)l 
CONT+,§(NRON ,(NCOL)) ((NCOL) + 1) 

DATA1+ ,§DATA 

BAS0+' rq50VCONTVDATA1V0VLVBOX;0 1 

-YL- ' 

BAS0+BAS0 , ' .16 .20.92 . 85W5I7V 1.87. 2VLINV1 1 0V 
0 10 0^' 
f?CW BAS 0 

C552O*-(M:05 + 1 ) 20 
XX«- C0L20 
1+1 

RHO+(NCOL- 1 ) 0 
LRHOl: I+I+l 

TIE++/( > JORliI-l)=JORlin ) 

+(TIE>(NRON 2))/LRH02 

RHOLI- l]^l-(6 (+/( ( e 70P[;J-l]- e 70i?[;J] )*2)) (WM 
((1VP0J7*2)-1))) 

+LRH03 

LRH02 :N1+NR0W ( ( (/7P5P/+1 ) 2 )*2 ) 

P50[J-l]«-(((+/(i70P[;5-l]*2))-7Vl)*O.5) <((+/<«70fl[;J] 
*2 ) ) -771 )*0 . 5 ) 

P50[J-1]^( (+/( e 70P[;J-l] J0P[;J] ))-/71)i?50[I-l] 

5P50 3 : ♦ ( J< (/VC55 ) U LRU 0 1 

5055^(55520,1) (( 20+S20 , 'RANK CORR . . - ' ), (20 4 g>555 ) , 



MEA+(COL20 ,1 ) (("20+520 
,1) (( 20+520 



7A5«-(55520 
MP4 ) ) 



1 MEAN .-') ,(20 U vMEAN DATA)) 

1 745J47V55 .-'),( 20 4 3> 74554555 






M5M( 5052 0,1 ) 
MEDIAN DATA ) ) 



(("20+520 , ' MEDIAN (20 4 <15(50520 

DATA2+§ ( CORR, ME A, VAR , MED ) 

UIND+' OBSERVATION ' ,UIND1 [ ( ( IP5-1 ) 120 )+ (NCOL 20 )] 
UIND+(COL 20,1) UIND 
SCRE+ 0,0.85,0.98,0.9 

TX+1 

LX+ 0 
SJZ<-6 

-91- 5451-' --10-XX-5Y UIND: \SIZ---S2Q SCRE--LIN 0 ' 

-92- 5451 -5451 , ' 140 -LIN LX PXVl 0 0V0 1 0 0¥' 

YY+COL20 1 
RUN BAS 1 

SCRE+0 ,0.05,0.98,0.15 
PX+4 

I<-0 

SIZ-e-5 
MOO : J-f-J+1 
(J>4)/M001 

UIND+(COL20 ,1) 54P42[(5-J);] 

YY+COL 20 I 
RUN BAS 1 
7200 

M 001:1+0 
TX+NROW 
SI Z+2 
LX + 1 

SCRE+0 ,0 . 2 
M01 : J-e-l+1 

UIND+(COL 20,1) ( (UIND2 [ J ; ] ), (5 0 3>055[J] ), 

(20 2 3>54P4[I;] )) 

YY+COL20 ( (NROW+1 )-I) 

RUN BAS 1 



0.98,0.85 






(I<NROW)/M 01 
(lO* ' Y 1 )/M02 
PAUSE 

' DO YOU WANT TO JOIN WITH LINES DATA POINTS OF THE 
SAME POSITION ' 

( ' Y'*1*Q)/M02 

BOX : ' ENTER THE POSITION OF THE DATA POINT ( ENTER 0 
TO FINISH ) ' 

(0=DP*0)/M02 

ZZ*JER LDP ; ] BOXLINES DATA 
BOX 

MO 2 '.PAUSE 
(NNCOL>0 )/L 05 
( ' Y'=l+PPF)/0 

TUMA : ' DO YOU WANT TO SEE THE DIFFERENCES BETWEEN 
COLUMNS Y/N?' 

('Y'*DIF* l*d)/0 
I PL* 0 

NNCOL*NNNCOL - 1 
COA* (NNCOL , 1 ) NNCOL 

' DO YOU WANT ABSOLUTE DIFFERENCES (A ) OR RELATIVE 
DIFF. (R)' 

( ' A ' *D IF1*1 -M3 ) /TUMA 1 

LHEAD* ' ABSOLUTE DIFFERENCES BETWEEN COLUMNS ' 

' THE DIFFER RELATIVE TO THE FIRST COLUMN (F) OR THE 
PREVIOUS (P) ?' 

( ' P' *1\DF*L]) /TUMA01 

DAT AO* (DAT AO [ ; NNCOL] -DAT AO [ ; (1 + NNCOL)] ) 

TUMA 2 

TUMA 01:DATA01*§( NNCOL , NROW ) DAT AO [ ; 1 ] 

DAT AO* I DATAOl-DATA [ ; ( 1+ NNCOL ) ] 

COA* (NNCOL , 1) 1 
TUMA 2 

TUMA 1 : LHEAD* ' RELATIVE DIFFERENCES BETWEEN COLUMNS ' 
' THE DIFFER. RELATIVE TO THE FIRST COLUMN (F) OR THE 
PREVIOUS (P) ? ' 

( 'P'xi + Vn/TUMAU 

DAT AO* I ((DATAOLi NNCOL1-DATAOL; (1+ NNCOL)] ) 

DATAOL ; NNCOL ] ) 100 
TUMA2 

TUMA11 : (NNCOL .NROW ) PAPAO [ : 1] 

DATAO*] ( (DATAOl-DATAOt; (1+ NNCOL)] ) DAT AO 1) 100 
CO A* (NNCOL , 1 ) 1 

TUMA 2 :AA*( NNCOL ,15) ' DIFF. BET.' 

UIND1*,AA,(2 0 s(COil)), ( (WiVCOL , 1 ) ) , (2 0 

5 ( (NNCOL, 1 ) ( 1+ NNCOL ) ) ) 

UIND2*UIND2 LORD ; ] 
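
Among other summaries, the BOXPLOTAB listing above appears to print a rank correlation between each pair of adjacent columns (the RHO lines). The listing is too damaged by reproduction to be run, so the following Python sketch is only an illustration of that idea under stated assumptions: it uses the standard Spearman formula 1 - 6Σd²/(n(n²-1)), it assumes there are no tied values within a column, and all function names are hypothetical. The tie-handling branch of the listing is not reproduced.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def adjacent_rank_corr(table):
    """Rank correlation between each column and the one before it.

    `table` is a list of rows (one row per observation), as in the
    panel data sets handled by BOXPLOTAB.
    """
    cols = list(zip(*table))
    return [spearman_no_ties(cols[j - 1], cols[j]) for j in range(1, len(cols))]


# Hypothetical 3-row, 3-column table: column 2 preserves the order of
# column 1, column 3 reverses it.
demo = [[1, 10, 3], [2, 20, 2], [3, 30, 1]]
rho = adjacent_rank_corr(demo)
```

A value near 1 says that the rows keep their relative order from one column (year) to the next, which is the kind of information the box plotted table is meant to expose.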
b. STAR plots and PROFILE plots (Program STARPLOT) 






STARPLOT;PRCD;ANS;SCOL;SROW;NCOL;NUP;INC;MAX;SINT;
COST;M;BAS;I;MIN;SPA;TC;PI;ONE;R;C;POSN;XAXIS;P;
BAS0;BAS1;N;X;XX;TYP

J0: ' TYPE (S) FOR STAR PLOT OR (P) FOR PROFILE PLOT ' 
FYP-el + Q 

( (FYPx'S' ) a (PYPs 'P' ) )/J0 
APMI 

(PRCDx'Y' )/0 
ONE* 1*0 

NCOL* l+( PAPA) 

WPO//<-l+( DATA) 

' DO YOU HAVE A (NROW × 20 CHARS) MATRIX WITH THE NAMES OF 
ROWS Y/N?' 

] ■*-('Y , *(l + D))/«700 

] ' ENTER THE NAME OF THE MATRIX OF NAMES ' 
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[ 12 ] N + 0 

[13] -*<701 

[14] <700 : J-cI+1 

[15] ' ENTER THE NAME FOR ROW NUMBER ' , ⍕(I) 

[16] N+N t (20+(d, (20 * '))) 

[17] + {KNROW)/JOO 

[18] <701 : 1*0 

[19] ' DO YOU HAVE A { NCOL 2 0 CHARS ) MATRIX WITH THE NAMES OF 
COLUMNS Y/N?' 






-*( •Y»*(l+D))/«702 
' ENTER THE MATRIX WITH THE NAMES ' 

NC + □ 

-*<703 

J02:I*I+! 

' ENTER THE NAME FOR COLUMN NUMBER ' , a> (I ) 

N+N, (20+CID, (20 ' ' ))) 

+{I<NCOL)/J 02 

<703 : ' DO YOU WANT ALL COLUMNS OF YOUR MATRIX OR 
SELECTED COL . ALL/SEL ? ' 

ANS+ 1+D 
+{ ANSI'S' ) /KOI 

' ENTER AS A VECTOR THE SELECTED COLUMNS ' 
DATA+DATA [ • SCOL*W\ 

KOI: 'DO YOU WANT ALL THE ROWS OF YOUR MATRIX OR 
SELECTED ROWS {ALL/SEL ) * 

ANS + If a 
+{ANS*'S' )/K0 2 

' ENTER AS A VECTOR THE SELECTED ROWS ' 

DATA+DATA ZSROW + □ ; ] 

KO 2 : TRANSFORM 
NCOL* 1+ ( DATA ) 

NROW+!* {DATA) 

CONI : ' ENTER NUMBER OF PLOTS PER SCREEN ( 3 4 OR 5 ) 
NUP*U 

*CON ((NUP>2)a(NUP<6)) 

' NUMBER OF PLOTS MUST BE 3 4 OR 5 , TRY AGAIN ' 

+CON 1 

CON :INC* 0.95 NUP 
ONE* {NCOL ,1 ) 1 
MAX*MIN*NCOL 0 
*{TYP= 'S' )/L 0 
XX* {NCOL ,!)X*{{! NCOL ) 

M+ClNCOL, 1) ' 

M*{ { 

-*LL 0 



. - x {{NCOL)-!)) x x 

. DL,1 ) 0 ) .XX, ONE , ONE , XX. ({NCOL. 1 ) 0 ) 

(2 NCOL), 3) M), [1] ((2,3 ) (1 ,XZNC0L1 , 0, 1,0,0 ) ) 



L0:SINT+lo{o2 {{NCOL)-!)) NCOL 
COST* 2o(o2 {{NCOL)-!)) NCOL 

M*ONE,{{NCOL, 1) (((0.8 COST)+ 1) 2)), {{NCOL, 1) 
(((0.8 SINT)+!) 2)) 

M*{{{2 NCOL) ,3) {M, {{NCOL, 3) (1 , 0 . 5 , 0 . 5 ) ) ) ) 

LLO :BAS* 1 nnl2W?3¥. 0 . 0 RPVRPWOFFV 1 VLINVLINV0FF9 ' 
RUN BAS 
ANG*I* 0 

-*( TYP-'P « )/LL01 

M*{{NCOL, 1) ((COSP+l) 2)), ((WCOL.l) ( (SIMF+1 ) 2)) 
-*L0 0 

LL0!:M*{{NCOL.!) (((1 NCOL) { { NCOL)-! ) ) )-0 . 02 ) , 
{{NCOL,!) 0.5) 

ANG * 90 
LOO :I*I+! 

NAM*NCZI ;] 

POSW-*(s(M[I;l] ,M[Jj2] )). « RP' 

BAS*' r*r\29NAM90VCQANGV6VN09N09' ,POSN , 'VOFFV' 

RUN BAS 

mNin+DATAZ!±&DATAZ-, J] ;J] 

M71X[J]^r^[( 1 )*&DATA [ ; J] ; J] 

*{I<NCOL)/L 00 

SP4^ • • 
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:76] I*-0 

: 77] TC+-NUP 

: 7 8 ] LOOP ': 3 : TC+TC+NUP 

: 79 ] PI*-0 .05, ( 1- (INC- {INC 6 ) ) ) , (0 . 05 + {INC- (.INC 6 ) ) ) , 1 
!80] Rf 0 

!81] LOOP2:R+R+l 
!82] C*~0 

!83] L00P1:C*-C+1 
:84] I-e-I+1 

!85] POSN<-PI+ ( {INC , (.-INC ) ,INC , (-INC)) ( (C-l ) , (B-l ) , (C-l ) , 
(R- 1))) 

!86 ] XAXIS+NLI;1 

!87] P+(DATALI:1 -MIN) (MAX-MIN) 

[88] ->(PYP= 'S' )/LO 00 

!8 9] M+(l ,0,0 ), [1] (ONE, XX, P), [1] ((3,3) (1 ,XlNCOL3 ,0,1, 0,0)) 
!90] +MOOO 

[91] LOOO:M+ONE, ((NCOL,l) (((PCOST) + 1) 2 ) ) , ( (NCOL , 1 ) 
(((PSINT) + 1) 2)) 

!92] M+M, [1] ( ( (2 NCOL) .3) (M , ( (NCOL, 3 ) (1,0. 5,0. 5)))) 

!93] M0OO:BASO*'nnl2tfM9l9.O . 0 RPVRPV0FFVP0SN9LIN9LIN90FFV ' 
!94] BAS1«- , rr29XAXJS9O9C9O969Y£SWV09.5 0.07 RP^ONV' 

!9 5] BiWBASO 

!96] P£W BAS1 

!97] +(I>NROW)/ENDO 

! 9 8 ] -> ( ( PC+C ) >NROW ) /END 

!99] +((C<NUP ) A ( (TC+C)<NROW) )/LOOPl 

HOOD (R<NUP ) / LOOP2 

:i01] ENDI PAUSE 

!102] ( (PC+C )<NROW ) / LOOP3 

:i03] ENDO: PAUSE 
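
The STARPLOT listing above appears to scale every variable to the range 0 to 1 using its column minimum and maximum, and then to place each scaled value on a ray whose angle depends on the variable's position (the SINT and COST lines). The Python sketch below illustrates that construction only. It assumes the rays are equally spaced over the full circle and that the plot is drawn in a unit square centred at (0.5, 0.5); the function name and the sample data are hypothetical and nothing here is taken verbatim from the APL code.

```python
import math


def star_vertices(row, mins, maxs):
    """Return the (x, y) vertices of one star in a unit square.

    row  : values of one observation, one per variable
    mins : column minima, maxs : column maxima (same length as row)
    """
    n = len(row)
    points = []
    for k, (value, lo, hi) in enumerate(zip(row, mins, maxs)):
        # Scale to [0, 1]; a constant column is drawn at the centre.
        p = (value - lo) / (hi - lo) if hi > lo else 0.0
        theta = 2 * math.pi * k / n          # equally spaced rays
        x = (p * math.cos(theta) + 1) / 2    # map [-1, 1] to [0, 1]
        y = (p * math.sin(theta) + 1) / 2
        points.append((x, y))
    return points


# Hypothetical observation with four variables, each ranging over 0..10.
vertices = star_vertices([10, 0, 5, 10], [0, 0, 0, 0], [10, 10, 10, 10])
```

Joining the vertices in order, and closing the polygon, gives the star for one row; a profile plot uses the same scaled values but places them at equally spaced positions along a horizontal axis instead of on rays.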



c. CODED SCATTER plots (Program SCATPLOT) 



[0] SCATPLOT;QUE1;CX;CY;I;DATA1;LHEAD;LPLOT;NCOL;LABX;
LABY;PXX;PXY;POSN;POSI;EXPRE;SYM;COL;SYZ;DESCRI;
POSLEG;A1;X;Y;SPA;X;Y;PLOT1;PLOTO;PLOTLEG
Cl] POSI+0 

[2] A DM I 

[3] DATA1+DATA 

[4] +END (PRCDx'Y') 

[5] +ONE (DIM-2 ) 

[6 ] ' YOUR DATA IS NOT A TWO DIMENSIONAL ARRAY , SCATPLOT BEING 
TERMINATED ' 






' PLEASE REFORMAT YOUR DATA AND START AGAIN ' 

+END 

ONE : DATA+DATA1 
SPA + ' 

NCOL+~l + (DATA) 

(POSI*0 )/ONE 00 
' ENTER THE SCREEN HEADER ' 

LHEAD+fl 

ONE 0 0 : ' ENTER THE PLOT HEADER ' 

LPLOT ’«-□ 

' ENTER THE VARIABLE (COLUMN ) FOR THE X AXIS ' 

X+DATAl\CX<r □] 

' ENTER THE LABEL FOR THE X AXIS ' 

LABX+0 

' DO YOU WANT ALL THE VALUES OF X OR JUST A SUBSAMPLE 
OF IT (ALL/ SUB)' 

QUE 1*1*D 
TWO (QUE1='A') 

' ENTER AN APL EXPRESSION WITH THE RANGE OF VALUES FOR X ' 
'E.G. (DATA [; ' , UCX), '] £500 )a (DATA [; ' , (<3>CX), ']£1000)' 
EXX+-0 

DATA+EXX/DATA 

TWO : ' ENTER THE VARIABLE (COLUMN ) FOR THE Y AXIS ' 



Y+DATAliCY+Ul 

' ENTER THE LABEL FOR THE Y AXIS ' 

LABY + Q 

' DO YOU WANT ALL THE VALUES OF Y OR JUST A SUBSAMPLE 
OF IT (ALL/SUB) ' 
QUE 1-e-l-t-Q 
TWO 1 {QUE1='A 1 ) 

' ENTER AN APL EXPRESSION WITH THE RANGE OF VALUES FOR Y ' 
'E.G. {DATA [ ; ' , (<J>CY ) , 1 ] £500 ) a {DATA [ ; ' , (3>CY), *]£1000) 
EXY + □ 

DATA+EXY/DATA 
Th/01 iY+DATAZiCYl 
X+DATA [ ; CXI 
JJITTER 
TRANSFORM 
MIN MAX 
1+ 0 



'ENTER THE POSITION FOR THE PLOT E.G. 1 21 22... ' 
POSI+POSN + □ 

LOOP 1 {P0SI>1 ) 

POSN+ 0.10.10.80.8 
LOOPli I+I+l 
□^3 0 ($7) 

i i 

' ENTER IN AN APL EXPRESSION FOR THIS CATEGORY ' 

' I.E . (MM[;4]<.5)a(MM[;8] = 5) ' 

1 USE DATA AS THE NAME OF YOUR VECTOR ' 

EXPRE+a 

' ENTER THE SYMBOL ' 

SYM^Q 

' ENTER THE COLOR , I.E. BLUE ' 

COL+Vl 

' ENTER THE SIZE , AS A NUMBER BETWEEN 1 (SMALL) AND 
12 (BIG) ' 
sYZ+a 

SYMBOLS+SYM, 1 ; ' ,COL, ' ; ' ,SYZ 
FOUR (7=1) 

PLOT1 + ' p p 1 OVXWYVzEXPREV 1 , SYMBOLS , 1 ^SPA^SPA^SPA^SPA ' 
PLOT l<rP LOT 1 , ' VPOSiVVVPVPVl 0 OVO 10 0V 
RUN PL0T1 
FIVE 

FOUR-.PLOTO+ 1 nnlOVXVYVzEXPREV 1 , SYMBOLS , ' VLPLOTVLHEADW 
LABX ' 

PLOTO+PLOTO , ' 9LABY9P0SN99LIN LX TXVLIN LY 7YV1 1 1 
VO 1 0 OV 
RUN PLOTO 

FIVEi+SIX {P0SI>1 ) 

' ENTER A LABEL (DESCRIPTION) FOR THIS CATEGORY (MAX 25 
CHARS.) ' 
DESCRI+IS+R, 1 1 

DESCRI+{*I) ' + 1 .DESCRI ,SYM , ' ' , {vSY Z) 

POSLEG+ 0.8, (0.75- (7 5 ) 100 ) 

PLOTLEG+ 1 fi fi 2^DESCRI ; 1 ,COL , 1 VOVLVOV3VYESWNOVPOSLEG RS 
VONV ' 

RUN PLOTLEG 

SIX : ' DO YOU WANT ANOTHER CATEGORY (YES/NO) ' 

QUE1<-1 't'Q 

LOOP 1 {QUE1- 1 Y 1 ) 

{POSI>l)/QUE01 

PAUSE 

END 

QUE01: ' DO YOU WANT ANOTHER PLOT (YES/NO) ' 

QUEl+U-B 
{QUEl-'Y ' )/ONE 
PAUSE 
END: 
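
In SCATPLOT each category of a coded scatter plot is defined by a logical expression on the data, together with a symbol, a colour and a size. The following Python sketch illustrates that selection step only. It is not a translation of the listing above: the function name, the tuple layout and the sample rows are hypothetical, and it makes the simplifying assumption that a row belongs to the first category whose expression it satisfies, so that categories cannot overlap.

```python
def split_by_category(rows, categories, x_col, y_col):
    """Group (x, y) points by the first category a row satisfies.

    categories : list of (label, symbol, predicate) tuples, where
                 predicate(row) plays the role of the APL expression
                 the user types for that category.
    Returns {label: (symbol, [(x, y), ...])}; rows that match no
    category are left out of the plot.
    """
    groups = {label: (symbol, []) for label, symbol, _ in categories}
    for row in rows:
        for label, _symbol, predicate in categories:
            if predicate(row):
                groups[label][1].append((row[x_col], row[y_col]))
                break  # first match wins, so categories never overlap
    return groups


# Hypothetical data: (cost, district type, length of stay).
rows = [(500, 1, 3.0), (800, 2, 4.5), (1200, 1, 6.0)]
categories = [
    ("LOW", "o", lambda r: r[0] <= 500),
    ("MID", "+", lambda r: 500 < r[0] <= 1000),
]
groups = split_by_category(rows, categories, x_col=0, y_col=2)
```

Each entry of the result can then be drawn with its own symbol, which is what makes the third (coded) variable visible on a two dimensional scatter plot.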



d. CODED DRAFTSMAN plots (Program DRAFTSMAN) 



[0] DRAFTSMAN ; NCOL -,PI ;R;C -,Y ;TN T2N ; XAXIS ; YAXIS -,X;LX-,TX;LY 
TY -,ANS ;F;R0B ; 71 ;X1 -,YS ;M;NUM-,PRCD -,DIM;YM;XM-,UM-,SKP 
ADMINS 

SPA+SYMBOLS+LPLOT+LHEAD+XAXIS+YAXIS+ ' ' 

SYMBOLS ' ° • 

EXP+ ' A ' 

+LP 1 (PRCD='Y ' ) 

•>0 

LPH+LP2 (DIM> 3) 

+ (LP2.LP3 ,DP4)[DJM] 

LP2 : ' YOUR DATA SET IS NOT A TWO OR THREE DIMENSIONAL 
ARRAY ' 

' DRAFTSMAN IS BEING TERMINATED . PLEASE REFORMAT YOUR 
DATA AND ' 

' REINITIATE DRAFTSMAN ' 

0 

LPU:N DRAFT DATA 
0 

LP3 t 

NCOL<-~ 1+ ( DATA ) 

JJITTER 
TRANSFORM 
GARY 

'DO YOU WANT A SYMBOLIC DRAFTSMAN (YES /NO ) ' 

QUE1+1+E 
CONI (QUElx'Y') 

XX+DATA 

NCOL+DRASYM DATA 
LHEAD +' ' 

LPL0T + ' ' 

' YOU HA VE NOW ' . ( vNCOL )_, ' BASIC VARIABLES TO PLOT ' 

CON 1 : ' ENTER NUMBER OF PLOTS PER SCREEN ( 3 4 OR 5 ) ' 

NUP+U 

CON ( (NUP>2 )a (NUP<6 ) ) 

' NUMBER OF PLOTS MUST BE 3 4 OR 5 , TRY AGAIN ' 

CD /VI 

CON :TR+-NUP 

INC+0 ,95~NUP 
LOOPU : TR-FTR+NUP 
TC+-NUP 

LOOPS -.TC+TC+NUP 

WI<- 0 . 05 , ( 1 - (INC- (INC 6 ) ) ) , (0 . 0 5 + (IWC- (INC 6 ) ) ),1 
R<- 0 

LOOP2 '.R+R+l 

c*-o 

Y-e-DATA [ ; ( TR+R )] 

LOOPl : C+C+l 
X+DATA [; (PC+C)] 

( (TR+R)= (TC+C ))/ SKIP 

(INC , ( -INC ) ,INC , (-INC ) ) ( (C-l ) , (R-l ) , (C-l ) , 






TC+C}','] 

TR+R ) : ] 

R=NUP)y ( (TR+R ) =NC0L ) ) ) /GRAPH 



] 



P0SN+WI+ 

(R- 1))) 

XAXIS+Nl 
YAXIS+Nf 

((c= i W 

XAXIS +' ' 

(C=l )/ GRAPH 
XAXIS<-N( (TC+C ) 

YAXIS +' ' 

( (R=NUP ) v ( (TR+R )-NCOL ) )/ GRAPH 
XAXI3+YAXIS+' ' 

GRAPH : M INMAX 
(ANSx'Y' }/FIN 
(SMT='M' )/M0V 
X LOWS Y 

SMOOTH + ' q 4 tfX9Y ; YS90 1919 . VSPAVSPAVXAXISVYAXISVPOSN ' 
SMOOTH+SMOOTH , ' 9DJW LX TXVLIN LY PY9 1 1 1910 11 0 0 ' 



RUN SMOOTH 
SKIP 

MOViMMOVS YL&X1 
YM+UM 

MMOVSX CAX] 

XM<-UM 

SMOOTHIE ' P4VX ; XM*?Y ; YMVO l¥l^ . VSPAVSPAVXAXISVYAXISVPOSNV ' 
SMOOTH 1 <r SMOOTH 1 , ' LIN LX TX^LIN LY TYV1 11910 110 0' 

RUN SMOOTH 1 
SKIP 

FIW:B4S«-' Afi 10 ,EXP, *¥' , SYMBOLS , 'VLPLOTVLHEADV 
XAXISV ' 

BAS+BAS , 1 7AXIS¥P0SWW£-ZW LX TX^LIN LY TY^l 1 1 
^10 110 09' 



RUN BAS 
SKIP:*( ( ( TR+R)>(NCOL)\ 
( ( C<NUP)a((TC+c)<NCOL , 
( (R<NUP)a((TR+R)<NCOL. 
END:+(ANS='Y' )/SKIP 1 
J7J C71BY2 PWC 
SBIP1:P7LFS£ 

BBASB 

TC+C)<NCOL)/LOOP 3 
FB+B )<NCOL)/LOOP 4 



a ((TC+C)>NCOL))/END 
)/LOOP 1 
)/LOOP 2 
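
Both STARPLOT and DRAFTSMAN place NUP by NUP small plots on one screen by computing a position vector (POSN) for each row R and column C. The arithmetic operators in the listings were lost in reproduction, so the Python sketch below is a reconstruction under assumptions, not the author's code: it assumes INC is 0.95 divided by NUP, that INC 6 stands for INC divided by 6, that the increments are multiplied by (C-1) and (R-1), and that the four numbers are ordered (xmin, ymin, xmax, ymax) in screen fractions. The function name is hypothetical.

```python
def panel_viewport(r, c, nup):
    """Screen rectangle (xmin, ymin, xmax, ymax) of the panel in row r, column c.

    Rows and columns are numbered from 1; row 1 is at the top of the screen.
    """
    inc = 0.95 / nup                      # assumed: INC = 0.95 / NUP
    side = inc - inc / 6                  # panel side, leaving a gap of INC/6
    base = (0.05, 1 - side, 0.05 + side, 1.0)   # top-left panel
    step = (inc * (c - 1), -inc * (r - 1), inc * (c - 1), -inc * (r - 1))
    return tuple(b + s for b, s in zip(base, step))


# Top-left panel of a 5 by 5 screen.
first = panel_viewport(1, 1, 5)
```

Under these assumptions every panel stays inside the unit square, and moving one column to the right or one row down shifts the rectangle by exactly INC.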
e. Supporting Sub-programs 



FUNCTION ADMI 






ADMI;QR1;QR2 

⍝ FUNCTION ADMI, CALLED BY FUNCTION SCATPLOT. USES 
⍝ FUNCTION CMSREAD. THIS FUNCTION IS A MODIFIED 
⍝ VERSION OF THE FUNCTION ADMINS FROM DTNLFNS VSAPLWS. 

PRCD+ ' Y ' 

' IS YOUR DATA SET LOCATED IN THIS WORKSPACE? (YES /NO ) ' 
OBl^ltQ 
PI (QBl* » Y 1 ) 

GO 

LP 1 : ' IS YOUR DATA SET LOCATED : ' 

' ( 1 ) IN AN APL WORKSPACE LOCATED ON THIS DISK OR ON A DISK ' 
' THAT YOU ARE LINKED TO ' 

' ( 2 ) IN A CMS FILE ON THIS DISK OR ON A DISK THAT YOU ARE ' 

' LINKED TO ' 

' ( 3 ) NEITHER ( 1 ) OR ( 2 ) ABOVE ' 

' ENTER (1,2 OR 3)' 

QR2+Q 

(LP 2 . LP 3, LPU ) [QB2] 

LP 2 : ' TO TRANSFER YOUR DATA TO THIS WORKSPACE : 1 
( 1 ) TYPE . . . )PCOPY (WS NAME ) (DATA SET NAME ) ' 

EXAMPLE : )PCOPY DTNLDATA CARS ' 

! 

DATE AND TIME SAVED INFORMATION IS DISPLAYED ' 

WHEN THE TRANSFER IS COMPLETE . THEN ENTER GO ' 

TO PROCED WITH SCATPLOT ' 

SLADMI+GO 

GO : • ENTER THE NAME OF YOUR DATA SET ' 

DATA * □ 

DIM<- DATA 
END 

LP 3 : ' TO TRANSFER YOUR CMS DATA FILE TO THIS WORKSPACE ' 

' ANSWER THE FOLLOWING QUESTIONS ABOUT YOUR DATA SET • 
DATA+CMSREAD 
DIM* DATA 
END 
[37] LP 4 : ' YOUR DATA SET MUST BE STORED IN AN APL WORKSPACE OR ' 
[38] ' IN A CMS FILE LOCATED ON THIS DISK OR ON A DISK TO WHICH ' 
[39] ' YOU ARE LINKED . SCATPLOT IS BEING TERMINATED . PLEASE ' 
[40] ' COMPLY WITH CONDITIONS ( 1 ) OR ( 2 ) AND REINITIATE SCATPLOT ' 
[41] PRCD←'N' 
[42] END: 



FUNCTION BOXLINES 


(11 140 ) ) ) 

[8] JM*( ( MX) JM),MX,( ( MX) JOR) 

[9] BAS2*' OP12VJW3V. 0 . 0 RPVRPVONV 0 .20 .98 .85 VLINVLINVOFFV ' 

[10] RUN BAS 2 

[11] END: DAT *- 0 



DAT*JQR BOXLINES DATA 
NCOL* 1 * DATA 
NROW* It DATA 
MX*§( 1+ (NCOL-1 ) ) 

MX+1 .MX* , $ (2 , MX) (MX, MX) 

JM*(( It MX) ,1 ) (0 1 ) 

, ( 1 - ( ( X. (NROW-1 ) ) JOR* JOR [MX] - 1 ) ) 

MX*((NR0W ( 1 * MX ) ) , 1 ) ((MX (2 14 )) + (! + MX) ((21 140) 



FUNCTION DRASYM 






(XX[;J]>100 )a(XX[;«7]=400 ) ' 



NCOL←DRASYM MATRIX;CI;CV;I;SYM;COL;SYZ;ANS 
' ENTER AS A VECTOR THE VARIABLES (COLUMNS) THAT YOU 
WHISH TO HAVE ' 

' IN THE X AND Y AXIS (THE FIRST AND SECOND DIMENSION 
FOR THE PLOT ) ' 

Cl* □ 

N*NL(CI):1 
DATA*MATRIX [ ; CI1 
NCOL* Cl 
1*0 

EXP*SYM*COL*SYZ * ' ' 

' NEXT, YOU HAVE TO ENTER APL EXPRESSION FOR EACH 
CATEGORY (CODE)' 

' USE XX AS THE NAME OF YOUR ARRAY ' 

1 1 

' I.E. 

t 1 

' WHERE I AND J REPRESENT COLUMN NUMBERS BETWEEN 1 
AND ' , (<5 Cl) 

' BE CAREFULLY NOT TO OVERLAP VALUES ' 

1 1 

' WHEN THE PROGRAM ASK FOR SYMBOLS TYPE ANY (ONE ) 
CHARACTER' 

' FOR COLORS TYPE THE NAME OF THE COLOR I.E. BLUE OR RED ' 

' WITH SIZES 1 REPRESENT SMALL AND 1 2 BIG ' 

1 1 

LOOP1 : 1*1+1 

' ENTER THE APL EXPRESSION FOR THE CATEGORY (CODE ) 
NUMBER ',(91) 

EXP*EXP , ' ; ' □ 

' ENTER THE SYMBOL ' 

SYM*SYM , □ 

' ENTER THE COLOR ' 

COL*COL , ' , ' ,□ 

' ENTER THE SIZE ' 

SYZ*SYZ 1 1 □ 

' DO YOU WHISH ANOTHER CATEGORY ( YES /NO ) ' 

ANS * ltd 

LOOP 1 (ANS-'Y') 

EXP*2\EXP 
[34] SYMBOLS←(1↓SYM),';',(2↓COL),';',(2↓SYZ) 



FUNCTION DRAFT 



[0] N DRAFT M;DATA;NCOL;TR;TC;PI;R;C;Y;TN;T2N;XAXIS;YAXIS;
LX;TX;LY;TY;ANS;F;ROB;Y1;X1;YS;M;NPAG;VAR;MORE;XU;
X;POSN;YU 

[1] ⍝⍝⍝ DO NOT MOVE OR ERASE; GRAFSTAT FUNCTION HEADER 
⍝⍝⍝ GRAFSTAT WILL NOT ADD A LINE TO THIS FUNCTION 
WITHOUT THIS HEADER 

' THE THREE DIMENSIONAL DRAFTSMAN DISPLAY IS BUILT 
ONE VARIABLE AT A' 

' TIME . PROGRAM WILL ASK YOU WHICH VARIABLE YOU 

WANT TO LOOK AT' 

' EACH TIME IT IS READY FOR A NEW ONE . THE DISPLAY 
PRESENTED FOR EACH ' 

' VARIABLE REPRESENTS THAT VARIABLE PLOTTED AGAINST 
ALL OTHER ' 

' VARIABLES PAGE BY PAGE . THAT IS , THE FIRST ROW 
REPRESENTS THE FIRST ' 

' PAGE OF DATA . THE SECOND ROW REPRESENTS THE SECOND 
PAGE AND SO ON' 

DATA*M 
SPA * ' ' 

NCOL* 2+ ( DATA ) 

NPAG* 1 ♦ ( DATA ) 

LOOP 5 : ' WHAT VARIABLE DO YOU WANT TO LOOK AT? ' 

((^5 (NCOL, 1) NCOL ), [2 ] (^5 (NCOL ,1 ) ' ' )),[2] N 

VAR*U 

XU*XU+0.1 XU*r /[ /DATALi ; (VAR)) 

( *N [ ( VAR );]),' WILL BE PLOTED AS THE INDEPENDENT (X 
VARIABLE ) ' 

' AND ALL OTHERS WILL BE PLOTTED AS DEPENDENT ( Y 
VARIABLES). ' 

JJITTER 
TRANSFORM 

GARY 

CON : ' ENTER * OF PLOTS PER SCREEN ( 3 , 4 OR 5 ) ' 

NUP* 1+D 

( (NUP< 3 ) v (NUP>5 ) )\CON 
INC* 0.9 5 NUP 

PI* Q.05, (l-(IJVC+(IiVC6))), (0.05 + (INC- (INC 6))),1 
TR* NUP 

LOOPn:TR*TR+NUP 
TC* NUP 

LOOP3 :TC*TC+NUP 
R* 0 

LOOP2 :R*R+1 
C* 0 

X-J -DATA [ (TR+R ) ; ; VAR ] 

DOOP1 : 0*0+1 

Y+ZMraUra+ff); ; (TC+C)) 

YU*YU+0.1 YU*[ /[ /DATAI-, ; (rc+C)] 

( (7AJ?)=(Z , C+C) )/SKIP 

POSN<rPI+ ( {INC , ( INC) , INC, (INC)) ( (0-1 ) , (ff-1 ) , (C-l ) , 
(ff-1))) . 

~ z’c+Oj] 






\VAR)% V 
- —p)y 



XAXIS*Nl 

YAXIS*NL 

( (0=1 )a ( (ff=ff£/p)v ( (TR+R )-NPAG) ) ) /GRAPH 
XAXIS* ' ' 

(0=1 ), /GRAPH 
XAXIS*N [ ( TC+C ) ; ] 

YAXIS* ' ' 

((R= NUP ) v ( ( TR +R ) = N PA G))/ GRAPH 
XAXI S*Y AXI S* ' ' 

GRAPH: MI NM AX 
(ANS*'Y' ) /FIN 
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[513 (SMT- ' M ' )/MOV 
[523 X LOWS Y 

[533 SM00TH3+' fl49X9Y ;YS90 1919 . VSPAVSPAVXAXISVYAXISVPOSNV 
LIN LX XP9 ' 

543 SM00TH3<-SM00TH3 , ' LIN LY 1091 1 1910 11 0 O' 

5 53 RUN SMOOTH 3 
56 3 SKIP 

5 7 3 MOV: M MMO VAV Y[&X3 
583 YM+YMAV 

593 M MMOVAV XUX3 
603 XM+YMAV 

613 SM0OFP13«-'o49X;XM9Y;YM9O 191 V .VSPA9SPAVXAXISVYAXISVP0SN 
9010 LX X09 ' 

[623 SMOOTH13+SMOOTH13 , ' LIN LY YUVl 1 1910 11 0 O' 

[6 3 3 RUN SMOOTH 13 
[643 SKIP 

[653 FIN : BASIC3 + ' r 49X9790919 . ’VSPA’&SPA’&XAXI SWY AXI S’&POSN’V 
LIN LX X09 ' 

663 BASIC3+BASIC3 , ' LIN LY Y091 1 1910 11 0 O' 

6 73 RUN BASIC3 

683 SKIP:+(( (TR+R)> (NPAG ) )a ( (TC+C )>NCOL ) ) /END 

693 ( (C<NUP}a( (TC+C)<NCOL) )/LOOP l 

703 ( (R<NUP )a ( (TR+R)<NPAG))/LOOP 2 

713 END :+(ANS= ' Y ' )/SKIP 1 

723 GARY 2 

73 3 SXIPlrPAOSP 

743 P04SP 

[7 5 3 ((TC+C)<NC0L)/L00P3 
'763 ( (TR+R)<NPAG)/LOOPU- 

[773 'DO YOU WANT TO LOOK AT ANOTHER VARIABLE? ' 

7 83 MORE+l + \n 

[793 L00P5 ( MORE-'Y ') 



FUNCTION GRAPHER 



[03 C04P0P0;C01;002;C03 ;ANS3;YS;Y1;X1;X;Y;ANS3;PRCD; 

DIM ; 0 ; REG 

[1] ⍝⍝⍝ DO NOT MOVE OR ERASE; GRAFSTAT FUNCTION HEADER 
[2] ⍝⍝⍝ GRAFSTAT WILL NOT ADD A LINE TO THIS FUNCTION 
WITHOUT THIS HEADER 
[3 3 ADMINS 
[43 NNN+N 

[53 +LP 1 (PRCD= ' Y ' ) 

[63 -*0 

[73 0P1 :+LP2 (DIM> 3) 

[83 +UP2.LP3 ,LPUKDIM1 

[93 LP 2 : ' YOUR DATA SET IS NOT A TWO OR THREE DIMENSIONAL 
ARRAY ' 

[103 ' GRAPHER IS BEING TERMINATED . PLEASE REFORMAT YOUR DATA 

AND' 






' REINITIATE GRAPHER ' 

0 

LP 4 : NNN GRAPH ER3 M 
0 

LP3 # 

NC0L+~1* ( DATA) 

JITTER 

TRANSFORM 

RR:' DO YOU WANT TO CONTINUE AND PLOT? (ENTER Y OR N ) ' 
CPl«-d 

(GR1*'Y' )/0 

' WHAT MATRIX POSITION ARE YOU REPRODUCING? ' 

GR2+U 

' WHAT POSITION ON THE SCREEEN? « 

GR 3*0 
GARY 3 
SPA <- ' ' 
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[28] (.ANS3* 'Y' )/L 1 

[29] CFFPS*' a49X9Y;YS90 1919 . VSPAVSPAVNNNZ (.GR2 [2] );]9 
NNNKGR 2[1] );] ' 

[30] GRFRS+GRFRS , ' 9GP39LIJV91 1 1910 11 0 0 ' 

[31] RUN GRFRS 

C 3 2 " 

[3 3] GRFR+' PUVDATAZ; (GR2Z21 )1VDATAZ; (CF2[1] )]9.919<>9 
SPA-SPA - ' 

[34] GRFR+GRFR, 'VNNNtiGR 2 [2] ) ; 1VNNNI (GR2 [1] );]9CF39 

LIN-LIN 

91 1 1910 110 0' 

[35] LURUNGRFR 

[36] RR ~ 



FUNCTION GRAPHER3 

[0] NNN GRAPH ER3 M 



⍝⍝⍝ DO NOT MOVE OR ERASE; GRAFSTAT FUNCTION HEADER 
⍝⍝⍝ GRAFSTAT WILL NOT ADD A LINE TO THIS FUNCTION 
WITHOUT THIS HEADER 
DATA+M 

NCOL<- 1 * ( DATA ) 

JITTER 

_ _ . TRANSFORM 

[7 1 RR:' DO YOU WANT TO CONTINUE AND PLOT ? ( ENTER Y OR N ) ' 

[ 8 ] GRl+ft 

[9] -*(CP1*'Y' )/0 

[10] ' WHAT MATRIX POSITION ARE YOU REPRODUCING ? ' 

[11] GR2+U 

[12] LIMITS 

[13] 'WHAT POSITION ON THE SCREEEN? ' 

[14] GR3+U 

[15] GARY 3 

[16] SPA+' ' 




[25] RR 



(ANS3*'Y' )/L 1 

GRFRS -*- ' a 49X9Y ; YS90 1919.9SPA9SPA9MWC (GR2 [2] );] 
VNNNKGR 2[1] );] ' 

GRFRS+GRFRS , '9CF39PFW9PIW91 1 1910 110 0' 

PfW GRFRS 
HR 

LHGRFR+' a49PAM[j <CF2[2] )]9PM71[; (CF2[1] )]9.91 
9o9SP49SPA9' 

GRFR+GRFR ' 'NNNl(GR2l2l ) j ] 9JWIW C ( CP 2[1] );]9CF39PIW 
9PI7791 1 1910 110 0' 

FPA7 CPFP 



FUNCTION PLOTQUERY 

[0] PLOTQUERY 

[1] ' ' 

[2] SPA+' ' 

[3] 'DO YOU WANT A PLOT OF YOUR LOWESS SMOOTHED CURVE? ' 

[4] ' (YES OR NO) ENTER NO IF NOT USING GRAFSTAT ' 

[5] PT+ 1 + D 

[6] +END ( PTx'Y ' ) 

[7] ' INPUT X AXIS LABEL' 

[8] XAXIS←⍞ 

[9] ' INPUT Y AXIS LABEL ' 

[10] YAXIS←⍞ 

[11] PL1 ( ROBx'Y ' ) 

[12] PHDR+ * ROBUST LOWESS SMOOTHING : F= ' 3 >F 

[13] RPLT+' a 49X19 Y1 ;YS90 1919.*+ vAo®4ity9SP49PFPF9X4XFS9 

Y4XJS9219' 

[14] RPLT+RPLT, ' LINVLIN91 1 190 1 0 0' 






F = ' 3>P 



RUN RPLT 
VIEW 
PL 2 

PLl'.PHDR*' NON -ROBUST LOWESS SMOOTHING', . - 
NRPLT+' a 49X19Y1;YS90 1919.*+ VAo®M9SPva9PPPF9XAXIS9 
YAXJS9219' 

NRPLT+NRPLT , ' LJW9LIiV9l 1 190 10 0' 

FPiV iVFPLP 
VIEW 

PL2:' DO YOU WANT A PLOT OF | RESIDUALS | VS X? ' 

' ( YES OR NO)' 

QS 5-e-l + Q 
END (QS5*'Y> ) 

' DO YOU WANT THIS PLOT SMOOTHED ? ' 

' ( YES OR NO)' 

QS 6-e-l + Q 

XRESID+ ' | RESIDUALS ' 

PL 3 (QS6*'Y') 

X LOWS ( | RESY ) 

SRESPLT+ ' p a 19X9 ( | RESY ) ; YS90 



1919. *+ 7Ao®A?9SPA9SPA9XilXIS9XflESIZ>9» 






SRESPLT+SRESPLT , ' 229LJW9LIW91 1 190 1 0 09' 

RUN SRESPLT 
PAUSE 
END 

PL 3 :RESPLT<- ' a a 19X9 ( ! RESY )90919. *+ VAo©M9SPA9SPA9XAXIS9 
XRESIDV ' 

RESPLT+RESPLT , ' 229LJiV9LIiV91 1 190 1 0 09' 

RUN RESPLT 

PAUSE 

END: 
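
PLOTQUERY plots the curve produced by the LOWESS smoother for a chosen fraction F, and optionally the absolute residuals against X. The smoothing function itself (LOWS) is not listed in this appendix, so the Python sketch below is only an illustration of locally weighted regression under stated assumptions: tricube weights, a straight line fitted at every point, a single pass without the robustness iterations, and hypothetical names throughout. It should not be read as the algorithm used by the thesis programs.

```python
def lowess(x, y, f=0.5):
    """Non-robust locally weighted linear regression (one pass).

    x, y : sequences of equal length
    f    : fraction of the points used in each local fit
    Returns the fitted value at every x.
    """
    n = len(x)
    q = min(n, max(2, int(round(f * n))))    # neighbours used per fit
    fitted = []
    for x0 in x:
        dist = [abs(xi - x0) for xi in x]
        h = sorted(dist)[q - 1] or 1e-12     # bandwidth; guard against h = 0
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1.0 - min(d / h, 1.0) ** 3) ** 3 for d in dist]
        sw = sum(w)                          # > 0 because x0 itself has weight 1
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical data lying exactly on a line: the smoother should return it unchanged.
xs = [float(i) for i in range(10)]
ys = [2.0 * v + 1.0 for v in xs]
smooth = lowess(xs, ys, f=0.5)
abs_resid = [abs(yi - si) for yi, si in zip(ys, smooth)]
```

A larger f gives a smoother curve because more points enter each local fit. The list abs_resid corresponds to the | RESIDUALS | VS X display offered by PLOTQUERY.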
APPENDIX B 

SAMPLE PROGRAM EXECUTION 



1. BOXPLOTTED TABLES 

This program, as mentioned in Chapter III, is executed by typing BOXPLOTAB 
and answering the queries as follows: 

BOXPLOTAB 

IS YOUR DATA SET LOCATED IN THIS WORKSPACE? (YES /NO ) 

YES 

ENTER THE NAME OF YOUR DATA SET 
□ : 

STOCK 

ENTER THE SCREEN LABEL 

ACTIVE STOCKS FOR THE WEEK ENDED AUG. 8 , 1986 

DO YOU HAVE A (NCOL × 20 CHARS) MATRIX WITH THE NAMES OF 
COLUMNS Y/N? 

YES 

ENTER THE NAME OF THE MATRIX 

□ : 

STOCKC 

DO YOU HAVE A (NROW × 15 CHARS) MATRIX WITH THE NAMES OF 
ROWS Y/N? 

YES 

ENTER THE NAME OF THE MATRIX 

□ : 

STOCKR 

DOU YOU WANT THE DATA ORDERED BY THE FIRST COLUMN ? Y/N 
YES 



(AT THIS POINT THE BOXPLOTTED TABLES ARE DISPLAYED ON THE 
SCREEN ) 



ENTER Q TO QUIT 
ENTER E TO ERASE AND CONTINUE 
ENTER C TO COPY AND CONTINUE 
ENTER CE TO COPY, ERASE AND CONTINUE 
PRESS ENTER ONLY TO CONTINUE 
CE 

DO YOU WANT TO JOIN WITH LINES DATA POINTS OF THE SAME 
POSITION 

YES 

ENTER THE POSITION OF THE DATA POINT (ENTER 0 TO FINISH ) 

□ : 

1 

ENTER THE POSITION OF THE DATA POINT (ENTER 0 TO FINISH ) 

□ : 

0 
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ENTER Q TO QUIT 

ENTER E TO ERASE AND CONTINUE 

ENTER C TO COPY AND CONTINUE 

ENTER CE TO COPY, ERASE AND CONTINUE 

PRESS ENTER ONLY TO CONTINUE 

Q 



2. STAR PLOTS 

As mentioned in Chapter III, this program is executed by typing STARPLOT 
and answering the queries as follows: 

STARPLOT 

TYPE (S) FOR STAR PLOT OR (P) FOR PROFILE PLOT 
S 

IS YOUR DATA SET LOCATED IN THIS WORKSPACE? (YES /NO ) 

YES 

ENTER THE NAME OF YOUR DATA SET 
□: 

AUTOS 

DO YOU HAVE A (NROW × 20 CHARS) ARRAY WITH NAMES OF ROWS Y/N? 

YES 

ENTER THE NAME OF THE MATRIX OF NAMES 

□ : 

AUTOSR 

DO YOU HAVE A (NCOL × 20 CHARS) MATRIX WITH THE NAMES OF 
COLUMNS Y/N? 

Y 

ENTER THE MATRIX WITH THE NAMES 
□ : 

AUTOSC 

DO YOU WANT ALL COLUMNS OF YOUR MATRIX OR SELECTED COL . 

ALL/ SEL? 

SEL 

ENTER AS A VECTOR THE SELECTED COLUMNS 

□ : 

⍳12 

DO YOU WANT ALL THE ROWS OF YOUR MATRIX OR SELECTED ROWS 
(ALL /SEL) 

ALL 

HOW MANY VARIABLES DO YOU WANT TO HAVE TRANSFORMED ? 

TYPE 0 IF YOU WANT NONE 

□ : 

0 

ENTER NUMBER OF PLOTS PER SCREEN ( 3 4 OR 5 ) 

□ : 

5 

(AT THIS POINT THE STAR PLOT IS SHOWN ON THE SCREEN ) 
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ENTER Q TO QUIT 

ENTER E TO ERASE AND CONTINUE 

ENTER C TO COPY AND CONTINUE 

ENTER CE TO COPY, ERASE AND CONTINUE 

PRESS ENTER ONLY TO CONTINUE 

CE 



3. PROFILE PLOTS 

As mentioned in Chapter III, profile plots are produced by the same program as 
star plots: type STARPLOT, enter P at the first query, and answer the 
remaining queries as follows: 

STARPLOT 

TYPE (S ) FOR STAR PLOT OR (P) FOR PROFILE PLOT 
P 

IS YOUR DATA SET LOCATED IN THIS WORKSPACE? (YES /NO ) 

YES 

ENTER THE NAME OF YOUR DATA SET 

□ : 

AUTOS 

DO YOU HAVE A (NROW × 20 CHARS) ARRAY WITH NAMES OF ROWS Y/N? 

YES 

ENTER THE NAME OF THE MATRIX OF NAMES 

□ : 

AUTOSR 

DO YOU HAVE A (NCOL × 20 CHARS) MATRIX WITH THE NAMES OF 
COLUMNS Y/N? 

Y 

ENTER THE MATRIX WITH THE NAMES 

□ : 

AUTOSC 

DO YOU WANT ALL COLUMNS OF YOUR MATRIX OR SELECTED COL . 

ALL/SEL? 

SEL 

ENTER AS A VECTOR THE SELECTED COLUMNS 

□ : 

⍳12 

DO YOU WANT ALL THE ROWS OF YOUR MATRIX OR SELECTED ROWS 
(ALL/SEL) 

ALL 

HOW MANY VARIABLES DO YOU WANT TO HAVE TRANSFORMED ? 

TYPE 0 IF YOU WANT NONE 

□ : 

0 

ENTER NUMBER OF PLOTS PER SCREEN ( 3 4 OR 5 ) 

□ : 

5 

(AT THIS POINT THE PROFILE PLOT IS SHOWN ON THE SCREEN) 



ENTER Q TO QUIT 
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ENTER E TO ERASE AND CONTINUE 
ENTER C TO COPY AND CONTINUE 
ENTER CE TO COPY, ERASE AND CONTINUE 
PRESS ENTER ONLY TO CONTINUE 

CE 



4. CODED SCATTER PLOTS 

As mentioned in Chapter III, this program is executed by typing SCATPLOT 
and answering the queries as follows: 



SCATPLOT 

IS YOUR DATA SET LOCATED IN THIS WORKSPACE? (YES /NO ) 

YES 

ENTER THE NAME OF YOUR DATA SET 

□ : 

AUTOS 

FROM NOW ON YOUR DATA SET WILL BE CALLED DATA (IN THIS PROGRAM ) 

ENTER THE SCREEN HEADER 
AUTOMOBILE DATA; PRICE VS M.P.G. CITY 

ENTER THE PLOT HEADER 

USA = A, FOREIGN = F AND WEIGHT = SIZE OF LETTER 

ENTER THE COLUMN NUMBER FOR THE VARIABLE ON THE X-AXIS 
□ : 

1 

ENTER THE LABEL FOR THE X AXIS 
PRICE 

DO YOU WANT ALL THE VALUES OF X OR JUST A SUBSAMPLE OF IT (ALL/SUB ) 
ALL 

ENTER THE COLUMN NUMBER FOR THE VARIABLE ON THE Y-AXIS 

□: 

2 

ENTER THE LABEL FOR THE Y AXIS 
M.P.G. CITY 

DO YOU WANT ALL THE VALUES OF Y OR JUST A SUBSAMPLE OF IT (ALL/SUB ) 
ALL 

HOW MANY VARIABLES DO YOU DESIRE JITTERED? 

TYPE 0 IF YOU WANT NONE 

□ : 

0 

HOW MANY VARIABLES DO YOU WANT TO HAVE TRANSFORMED ? 

TYPE 0 IF YOU WANT NONE 

□ : 

0 

ENTER THE POSITION FOR THE PLOT E.G. 1 21 22... 

□ : 

1 

111111111111111111111111111111 

ENTER IN AN APL EXPRESSION FOR THIS CATEGORY 
I.E. (DATA[;4]≤.5)∧(DATA[;8]=5) 

USE DATA AS THE NAME OF YOUR VECTOR 
(DATA[;13]=1)∧(DATA[;8]≤2500) 

ENTER THE SYMBOL (ANY LETTER, NUMBER OR SPECIAL CHARACTER) 

A 

ENTER THE COLOR (WHITE, GREEN, BLUE, TURQUOISE, RED, YELLOW OR PINK) 
BLUE 

ENTER THE SIZE, AS A NUMBER BETWEEN 1 (SMALL) AND 12 (BIG) 

3 

ENTER A LABEL (DESCRIPTION) FOR THIS CATEGORY (MAX 25 CHARS.) 

USA ≤ 2500 LB. 

DO YOU WANT ANOTHER CATEGORY (YES /NO ) 

YES 

222222222222222222222222222222 

ENTER IN AN APL EXPRESSION FOR THIS CATEGORY 

I.E. (DATA[;4]≤.5)∧(DATA[;8]=5) 

USE DATA AS THE NAME OF YOUR VECTOR 

(DATA[;13]=1)∧((DATA[;8]>2500)∧(DATA[;8]≤3000)) 

ENTER THE SYMBOL (ANY LETTER, NUMBER OR SPECIAL CHARACTER) 

A 

ENTER THE COLOR (WHITE, GREEN, BLUE, TURQUOISE, RED, YELLOW OR PINK) 
BLUE 

ENTER THE SIZE, AS A NUMBER BETWEEN 1 (SMALL) AND 12 (BIG) 

5 

ENTER A LABEL (DESCRIPTION) FOR THIS CATEGORY (MAX 25 CHARS.) 
2500 < USA ≤ 3000 LB. 

DO YOU WANT ANOTHER CATEGORY (YES /NO ) 

YES 

333333333333333333333333333333 

ENTER IN AN APL EXPRESSION FOR THIS CATEGORY 

I.E. (DATA[;4]≤.5)∧(DATA[;8]=5) 

USE DATA AS THE NAME OF YOUR VECTOR 

(DATA[;13]=1)∧((DATA[;8]>3000)∧(DATA[;8]≤3500)) 

ENTER THE SYMBOL (ANY LETTER, NUMBER OR SPECIAL CHARACTER) 

A 

ENTER THE COLOR (WHITE, GREEN, BLUE, TURQUOISE, RED, YELLOW OR PINK) 
RED 

ENTER THE SIZE, AS A NUMBER BETWEEN 1 (SMALL) AND 12 (BIG) 

7 

ENTER A LABEL (DESCRIPTION) FOR THIS CATEGORY (MAX 25 CHARS.) 
3000 < USA ≤ 3500 LB. 

DO YOU WANT ANOTHER CATEGORY (YES /NO ) 

YES 

444444444444444444444444444444 

ENTER IN AN APL EXPRESSION FOR THIS CATEGORY 

I.E. (DATA[;4]≤.5)∧(DATA[;8]=5) 

USE DATA AS THE NAME OF YOUR VECTOR 
(DATA[;13]=1)∧(DATA[;8]>3500) 

ENTER THE SYMBOL (ANY LETTER, NUMBER OR SPECIAL CHARACTER) 

A 

ENTER THE COLOR (WHITE, GREEN, BLUE, TURQUOISE, RED, YELLOW OR PINK) 

ENTER THE SIZE, AS A NUMBER BETWEEN 1 (SMALL) AND 12 (BIG) 

9 

ENTER A LABEL (DESCRIPTION) FOR THIS CATEGORY (MAX 25 CHARS.) 

USA > 3500 LB. 

DO YOU WANT ANOTHER CATEGORY (YES /NO ) 

YES 

555555555555555555555555555555 

ENTER IN AN APL EXPRESSION FOR THIS CATEGORY 

I.E. (DATA[;4]≤.5)∧(DATA[;8]=5) 

USE DATA AS THE NAME OF YOUR VECTOR 
(DATA[;13]≠1)∧(DATA[;8]≤2500) 

ENTER THE SYMBOL (ANY LETTER, NUMBER OR SPECIAL CHARACTER) 

F 

ENTER THE COLOR (WHITE, GREEN, BLUE, TURQUOISE, RED, YELLOW OR PINK) 
RED 

ENTER THE SIZE, AS A NUMBER BETWEEN 1 (SMALL) AND 12 (BIG) 

3 

ENTER A LABEL (DESCRIPTION) FOR THIS CATEGORY (MAX 25 CHARS.) 
FOREIGN ≤ 2500 LB. 

DO YOU WANT ANOTHER CATEGORY ( YES /NO ) 

YES 

666666666666666666666666666666 

ENTER IN AN APL EXPRESSION FOR THIS CATEGORY 

I.E. (DATA[;4]≤.5)∧(DATA[;8]=5) 

USE DATA AS THE NAME OF YOUR VECTOR 

(DATA[;13]≠1)∧((DATA[;8]>2500)∧(DATA[;8]≤3000)) 

ENTER THE SYMBOL (ANY LETTER, NUMBER OR SPECIAL CHARACTER) 

F 

ENTER THE COLOR (WHITE, GREEN, BLUE, TURQUOISE, RED, YELLOW OR PINK) 
RED 

ENTER THE SIZE, AS A NUMBER BETWEEN 1 (SMALL) AND 12 (BIG) 

5 

ENTER A LABEL (DESCRIPTION) FOR THIS CATEGORY (MAX 25 CHARS.) 
2500 < FOREI. ≤ 3000 LB. 

DO YOU WANT ANOTHER CATEGORY (YES /NO ) 

YES 

777777777777777777777777777777 

ENTER IN AN APL EXPRESSION FOR THIS CATEGORY 

I.E. (DATA[;4]≤.5)∧(DATA[;8]=5) 

USE DATA AS THE NAME OF YOUR VECTOR 

(DATA[;13]≠1)∧((DATA[;8]>3000)∧(DATA[;8]≤3500)) 

ENTER THE SYMBOL (ANY LETTER, NUMBER OR SPECIAL CHARACTER) 

F 

ENTER THE COLOR (WHITE, GREEN, BLUE, TURQUOISE, RED, YELLOW OR PINK) 
RED 

ENTER THE SIZE, AS A NUMBER BETWEEN 1 (SMALL) AND 12 (BIG) 

7 

ENTER A LABEL (DESCRIPTION) FOR THIS CATEGORY (MAX 25 CHARS.) 

3000 < FOREI. ≤ 3500 LB. 

DO YOU WANT ANOTHER CATEGORY (YES /NO ) 

NO 
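
The seven category expressions entered in this session code each automobile by origin (column 13) and weight class (column 8). The following Python sketch is an illustration only, not part of SCATPLOT: it restates the seven expressions as predicates and checks that no row falls into more than one category. The names CATEGORIES, classify and code_rows and the sample rows are hypothetical, and the less-than-or-equal boundaries are read from the category labels.

```python
# Each entry: (label, plotting symbol, symbol size, predicate on origin and weight).
# origin stands for DATA[;13] (1 = USA) and weight for DATA[;8] (pounds).
CATEGORIES = [
    ("USA <= 2500 LB.", "A", 3, lambda o, w: o == 1 and w <= 2500),
    ("2500 < USA <= 3000 LB.", "A", 5, lambda o, w: o == 1 and 2500 < w <= 3000),
    ("3000 < USA <= 3500 LB.", "A", 7, lambda o, w: o == 1 and 3000 < w <= 3500),
    ("USA > 3500 LB.", "A", 9, lambda o, w: o == 1 and w > 3500),
    ("FOREIGN <= 2500 LB.", "F", 3, lambda o, w: o != 1 and w <= 2500),
    ("2500 < FOREI. <= 3000 LB.", "F", 5, lambda o, w: o != 1 and 2500 < w <= 3000),
    ("3000 < FOREI. <= 3500 LB.", "F", 7, lambda o, w: o != 1 and 3000 < w <= 3500),
]


def classify(origin, weight):
    """Return the indices of every category whose predicate holds for the row."""
    return [i for i, (_, _, _, pred) in enumerate(CATEGORIES) if pred(origin, weight)]


def code_rows(rows):
    """Map each (origin, weight) row to (symbol, size), or None if uncategorized."""
    coded = []
    for origin, weight in rows:
        hits = classify(origin, weight)
        if len(hits) > 1:
            raise ValueError("overlapping categories for row %r" % ((origin, weight),))
        coded.append(CATEGORIES[hits[0]][1:3] if hits else None)
    return coded


if __name__ == "__main__":
    # Made-up rows for illustration; these are not values from the AUTOS data set.
    sample_rows = [(1, 2400), (1, 3000), (1, 3600), (0, 2500), (0, 3200), (0, 3800)]
    for row, code in zip(sample_rows, code_rows(sample_rows)):
        print(row, code)
```

Because the session stops after the seventh category, a foreign automobile heavier than 3500 lb matches none of the expressions; in the sketch such a row is returned as None.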
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5. CODED DRAFTSMAN PLOTS 

As mentioned in Chapter III, this program is executed by typing 
DRAFTSMAN and answering the queries as follows: 



DRAFTSMAN 

IS YOUR DATA SET LOCATED IN THIS WORKSPACE? 

( YES OR NO) 

YES 

ENTER THE NAME OF YOUR DATA SET 

□: 

AUTOS 

DO YOU WANT ALL OF THIS DATA OR JUST A SUBSAMPLE OF IT TO 
BE PRESENTED IN THE DRAFTSMAN DISPLAY? ENTER ( ALL OR SUB ) 

ALL 

DO YOU HAVE A TWO DIMENSIONAL ARRAY OF NAMES FOR THE DATA 
WHICH IS TO BE DISPLAYED? NOTE: THESE NAMES ARE THE NAMES 
OF THE VARIABLES REPRSENTED BY THE COLUMNS OF YOUR DATA SET. 

( YES OR NO ) 

YES 

WHAT IS THE NAME OF YOUR ARRAY OF VARIABLE NAMES? 

□ : 

AUTOSC 

HOW MANY VARIABLES DO YOU DESIRE JITTERED? 

TYPE 0 IF YOU WANT NONE 
□: 

0 

HOW MANY VARIABLES DO YOU WANT TO HAVE TRANSFORMED ? 

TYPE 0 IF YOU WANT NONE 
□: 

0 

DO YOU WANT TO DO WANT TO FIT A SMOOTHED CURVE 
ON ALL DRAFTSMAN PLOTS? . . . (YES OR NO) 

NO 

DO YOU WANT A SYMBOLIC DRAFTSMAN (YES /NO ) 

YES 

ENTER AS A VECTOR THE VARIABLES (COLUMNS ) THAT YOU WHISH TO HAVE 
IN THE X AND Y AXIS (THE FIRST AND SECOND DIMENSION FOR THE PLOT ) 

□: 

1118 2 

NEXT, YOU HAVE TO ENTER APL EXPRESSION FOR EACH CATEGORY (CODE) 
USE XX THE NAME OF YOUR ARRAY 

I.E. (XX[;I]>100)∧(XX[;J]=400) 

WHERE I AND J REPRESENT COLUMN NUMBERS BETWEEN 1 AND 4 
BE CAREFULLY NOT TO OVERLAP VALUES 

WHEN THE PROGRAM ASK FOR SYMBOLS TYPE ANY (ONE ) CHARACTER 
FOR COLORS TYPE THE NAME OF THE COLOR I.E. BLUE OR RED 
WITH SIZES 1 REPRESENT SMALL AND 12 BIG 

ENTER THE APL EXPRESSION FOR THE CATEGORY (CODE ) NUMBER 1 
XX[;13]=1 
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ENTER THE SYMBOL 

A 

ENTER THE COLOR 
RED 

ENTER THE SIZE 
4 

DO YOU WHISH ANOTHER CATEGORY ( YES /NO ) 

YES 

ENTER THE APL EXPRESSION FOR THE CATEGORY (CODE) NUMBER 2 
XX[;13]≠1 

ENTER THE SYMBOL 
F 

ENTER THE COLOR 
RED 

ENTER THE SIZE 
4 

DO YOU WHISH ANOTHER CATEGORY (YES/NO ) 

NO 

YOU HAVE NOW 4 BASIC VARIABLES TO PLOT 
ENTER NUMBER OF PLOTS PER SCREEN ( 3 4 OR 5 ) 

□ : 

4 

DO YOU WANT TO FIT A SMOOTHED CURVE 
ON SELECTED PLOTS ? . . . (YES OR NO ) 

NO 



(AT THIS POINT THE CODED DRAFTSMAN PLOT IS SHOWN ON THE SCREEN ) 



ENTER Q TO QUIT 

ENTER E TO ERASE AND CONTINUE 

ENTER C TO COPY AND CONTINUE 

ENTER CE TO COPY , ERASE AND CONTINUE 

PRESS ENTER ONLY TO CONTINUE 

CE 






APPENDIX C 

STAR PLOTS OF AUTOMOBILE DATA 






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